Abstract. Opacity is a general language-theoretic framework in which several security
properties of a system can be expressed. Its parameters are a predicate, given as
a subset of runs of the system, and an observation function, from the set of runs into 
a set of observables. The predicate describes secret information in the system and, 
in the possibilistic setting, it is opaque if its membership cannot be inferred from 
observation. 

In this paper, we propose several notions of quantitative opacity for probabilistic 
systems, where the predicate and the observation function are seen as random vari- 
ables. Our aim is to measure (i) the probability of opacity leakage relative to these 
random variables and (ii) the level of uncertainty about membership of the predicate 
inferred from observation. We show how these measures extend possibilistic opacity, 
we give algorithms to compute them for regular secrets and observations, and we ap- 
ply these computations on several classical examples. We finally partially investigate 
the non-deterministic setting. 

1 Introduction 

Motivations. Opacity [2] is a very general framework where a wide range of security prop- 
erties can be specified, for a system interacting with a passive attacker. This includes for 
instance anonymity or non-interference [3] , the basic version of which states that high level 
actions cannot be detected by low level observations. Non-interference alone cannot capture
every type of information flow property. Indeed, it expresses the complete absence of information
flow, yet many information flow properties, like anonymity, permit some information
flow while requiring that a particular piece of information be kept secret. The notion of opacity
was introduced with the aim of providing a uniform description for security properties, e.g.
non-interference, noninference, various notions of anonymity, key compromise and refresh,
downgrading, etc. [4]. Ensuring opacity by control was further studied in [5].

The general idea behind opacity is that a passive attacker should not gain worthwhile
information, even though it can observe the system from the outside. The approach, like
many existing information flow-theoretic approaches, is possibilistic. We mean by this that
nondeterminism is used to model the random mechanism generating all
possible system behaviors. As such, opacity is not accurate enough to take into account
two orthogonal aspects of security properties, both of which concern the evaluation of the
information gained by a passive attacker.

The first aspect concerns the quantification of security properties. If executions leaking 
information are negligible with respect to the rest of executions, the overall security might 
not be compromised. For example if an error may leak information, but appears only in 1%
of cases, the program could still be considered safe. The definitions of opacity [6,4] capture 
the existence of at least one perfect leak, but do not grasp such a measure. 

The other aspect regards the category of security properties a system has to assume
when interacting with an attacker able to make inferences from experiments on the basis of
statistical analysis. For example, if every time the system goes bip, there is a 99% chance that
action a has been carried out by the server, then every bip can be guessed to have resulted
from an a. Since more and more security protocols make use of randomization to reach some
security objectives [7,8], it becomes important to extend specification frameworks in order 
to cope with it. 

Contributions. In this paper we investigate several ways of extending opacity to a purely 
probabilistic framework. Opacity can be defined either as the capacity for an external ob- 
server to deduce that a predicate was true (asymmetrical opacity) or whether a predicate 
is true or false (symmetrical opacity). Both notions can model relevant security properties, 
hence deserve to be extended. On the other hand, two directions can be taken towards the 
quantification of opacity. The first one, which we call liberal, evaluates the degree of non- 
opacity of a system: how big is the security hole? It aims at assessing the probability for 
the system to yield perfect information. The second direction, which is called restrictive, 
evaluates how opaque the system is: how robust is the security? The goal here is to measure 
how reliable is the information gained through observation. This yields up to four notions 
of quantitative opacity, displayed in Table 1, which are formally defined in this paper. The
choice made when defining these measures was that a value 0 should be meaningful for
opacity in the possibilistic sense. As a result, liberal measures are 0 when the system is
opaque and restrictive ones are 0 when the system is not.

Moreover, like opacity itself, all these measures can be instantiated into several prob- 
abilistic security properties such as probabilistic non-interference and anonymity. We also 
show how to compute these values in some regular cases and apply the method to the dining
cryptographers problem and the Crowds protocol, re-confirming in passing the correctness
result of Reiter and Rubin [8].

Although the measures are defined in systems without nondeterminism, they can be
extended to the case of systems scheduled by an adversary. We show that non-memoryless
schedulers are required in order to reach optimum opacity measures.

Table 1. The four probabilistic opacity measures.

Related Work. Quantitative measures for security properties were first advocated in [9]
and [10]. In [9], Millen makes an important step by relating the non-interference property
with the notion of mutual information from information theory in the context of a system
modeled by a deterministic state machine. He proves that the system satisfies the non-interference
property if and only if the mutual information between the high-level input
random variable and the output random variable is zero. He also proposes mutual information
as a measure for information flow by showing how information flow can be seen as a noisy
probabilistic channel, but he does not show how to compute this measure. In [10] Wittbold 
and Johnson introduce nondeducibility on strategies in the context of a non-deterministic 
state machine. A system satisfies nondeducibility on strategies if the observer cannot deduce 
information from the observation by any collusion with a secret user and using any adaptive 
strategies. They observe that if such a system is run multiple times with feedback between 
runs, information can be leaked by coding schemes across multiple runs. In this case, they 
show that a discrete memoryless channel can be built by associating a distribution with the 
noise process. From then on, numerous studies were devoted to the computation of (covert) 
channel capacity in various cases (see e.g. [11]) or more generally information leakage. 

In [12], several measures of information leakage extending these seminal works for de- 
terministic or probabilistic programs with probabilistic input are discussed. These measures 
quantify the information concerning the input gained by a passive attacker observing the 
output. Exhibiting programs for which the value of entropy is not meaningful, Smith proposes
to consider instead the notions of vulnerability and min-entropy to take into account the
fact that some execution could leak a sufficiently large amount of information to allow the
environment to guess the remaining secret. As discussed in Section 6, probabilistic opacity
takes this into account.

In [13], in order to quantify anonymity, the authors propose to model the system (then 
called Information Hiding System) as a noisy channel in the sense of Information Theory: 
The secret information is modeled by the inputs, the observable information is modeled by 
the outputs and the two sets are related by a conditional probability matrix. In this context,
probabilistic information leakage is very naturally specified in terms of mutual information
and capacity. A whole hierarchy of probabilistic notions of anonymity has been defined.
The approach was completed in [14] where anonymity is computed using regular expressions. 
More recently, in [15], the authors consider Interactive Information Hiding Systems that can 
be viewed as channels with memory and feedback. 

In [16], the authors analyze the asymptotic behaviour of attacker's error probability and 
information leakage in Information Hiding Systems in the context of an attacker having 
the capabilities to make exactly one guess after observing n independent executions of the 
system while the secret information remains invariant. Two cases are studied: the case in 
which each execution gives rise to a single observation and the case in which each state of an 
execution gives rise to an observation in the context of Hidden Markov Models. The relation
of these sophisticated attacker models with our attacker model remains to be clarified. Similar
models were also studied in [17], where the authors define an ordering w.r.t. probabilistic 
non-interference. 

For systems modeled by process algebras, pioneering work was presented in [18,19], with 
channel capacity defined by counting behaviors in discrete time (non-probabilistic) CSP [18], 
or various probabilistic extensions of noninterference [19] in a generative-reactive process al- 
gebra. Subsequent studies in this area by [20,21,22] also provide quantitative measures of 
information leak, relating these measures with noninterference and secrecy. In [20], the au- 
thors introduce various notions of noninterference in a Markovian process calculus extended 
with prioritized/probabilistic zero duration actions and untimed actions. In [21] the author 
introduces two notions of information leakage in the (non-probabilistic) π-calculus, differing
essentially in the assumptions made on the power of the attacker. The first one, called
absolute leakage, corresponds to the average amount of information that was leaked to the
attacker by the program in the context of an attacker with unlimited computational resources.
It is defined in terms of conditional mutual information and follows the earlier results
of Millen [9]. The second notion, called leakage rate, corresponds to the maximal number
of bits of information that could be obtained per experiment in the context in which the 
attacker can only perform a fixed number of tries, each yielding a binary outcome repre- 
senting success or failure. Boreale also studies the relation between both notions of leakage 
and proves that they are consistent. The author also investigates compositionality of leak- 
age. Boreale et al. [22] propose a very general framework for reasoning about information 
leakage in a sequential process calculus over a semiring with some appealing applications to 
information leakage analysis when instantiating and interpreting the semiring. It appears 
to be a promising scheme for specifying and analysing regular quantitative information flow 
like we do in Section 5. 

Although the literature on quantifying information leakage or channel capacity is dense, 
few works actually tried to extend general opacity to a probabilistic setting. A notion of 
probabilistic opacity is defined in [23] , but restricted to properties whose satisfaction depends 
only on the initial state of the run. The opacity there corresponds to the probability for an 
observer to guess from the observation whether the predicate holds for the run. In that sense 
our restrictive opacity (Section 4) is close to that notion. However, the definition of [23] 
lacks clear ties with the possibilistic notion of opacity. Probabilistic opacity is somewhat
related to the notion of view presented in [24], as the authors include, like we do, a predicate
in their probabilistic model and observation function, but probabilistic opacity can hardly
be compared with view. Indeed, on one hand, although their setting is different (they work
on Information Hiding Systems extended with a view), our predicates over runs could be 
viewed as a generalization of the predicates over a finite set of states (properties). On the 
other hand, a view in their setting is an arbitrary partition of the state space, whereas we 
partition the runs into only two equivalence classes (corresponding to true and false). 

Organization of the paper. In Section 2, we recall the definitions of opacity and the 
probabilistic framework used throughout the paper. Sections 3 and 4 present respectively the
liberal and the restrictive version of probabilistic opacity, both for the asymmetrical and
symmetrical case. We present in Section 5 how to compute these measures automatically if
the predicate and observations are regular. Section 6 compares the different measures and
what they reveal about the security of the system, through abstract examples and
a case study of the Crowds protocol. In Section 7, we present the framework of probabilistic 
systems dealing with nondeterminism, and open problems that arise in this setting. 

2 Preliminaries 

In this section, we recall the notions of opacity, entropy, and probabilistic automata.

2.1 Possibilistic opacity

The original definition of opacity was given in [4] for transition systems. 
Recall that a transition system is a tuple $\mathcal{A} = (\Sigma, Q, \Delta, I)$ where $\Sigma$ is a set of actions, $Q$
is a set of states, $\Delta \subseteq Q \times \Sigma \times Q$ is a set of transitions and $I \subseteq Q$ is a subset of initial states.
A run in $\mathcal{A}$ is a finite sequence of transitions written as: $\rho = q_0 \xrightarrow{a_1} q_1 \xrightarrow{a_2} q_2 \cdots \xrightarrow{a_n} q_n$.
For such a run, $\mathrm{fst}(\rho)$ (resp. $\mathrm{lst}(\rho)$) denotes $q_0$ (resp. $q_n$). We will also write $\rho \cdot \rho'$ for the run
obtained by concatenating runs $\rho$ and $\rho'$ whenever $\mathrm{lst}(\rho) = \mathrm{fst}(\rho')$. The set of runs starting
in state $q$ is denoted by $\mathit{Run}_q(\mathcal{A})$ and $\mathit{Run}(\mathcal{A})$ denotes the set of runs starting from some
initial state: $\mathit{Run}(\mathcal{A}) = \bigcup_{q \in I} \mathit{Run}_q(\mathcal{A})$.

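Opacity qualifies a predicate $\varphi$, given as a subset of $\mathit{Run}(\mathcal{A})$ (or equivalently as its
characteristic function $\mathbf{1}_\varphi$), with respect to an observation function $\mathcal{O}$ from $\mathit{Run}(\mathcal{A})$ onto a
(possibly infinite) set $\mathit{Obs}$ of observables. Two runs $\rho$ and $\rho'$ are equivalent w.r.t. $\mathcal{O}$ if they
produce the same observable: $\mathcal{O}(\rho) = \mathcal{O}(\rho')$. The set $\mathcal{O}^{-1}(o)$ is called an observation class.
We sometimes write $[\rho]_{\mathcal{O}}$ for $\mathcal{O}^{-1}(\mathcal{O}(\rho))$.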

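A predicate $\varphi$ is opaque on $\mathcal{A}$ for $\mathcal{O}$ if for every run $\rho$ satisfying $\varphi$, there is a run $\rho'$ not
satisfying $\varphi$ equivalent to $\rho$.

Definition 1 (Opacity). Let $\mathcal{A}$ be a transition system and $\mathcal{O} : \mathit{Run}(\mathcal{A}) \to \mathit{Obs}$ a surjective
function called observation. A predicate $\varphi \subseteq \mathit{Run}(\mathcal{A})$ is opaque on $\mathcal{A}$ for $\mathcal{O}$ if, for any
$o \in \mathit{Obs}$, the following holds:

$$\mathcal{O}^{-1}(o) \not\subseteq \varphi.$$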

However, detecting whether an event did not occur can give as much information as the 
detection that the same event did occur. In addition, as argued in [6], the asymmetry of this 
definition makes it impossible to use with refinement: opacity would not be ensured in a 
system derived from a secure one in a refinement-driven engineering process. More precisely, 
if $\mathcal{A}'$ refines $\mathcal{A}$ and a property $\varphi$ is opaque on $\mathcal{A}$ (w.r.t. $\mathcal{O}$), $\varphi$ is not guaranteed to be opaque
on $\mathcal{A}'$ (w.r.t. $\mathcal{O}$).

Hence we use the symmetric notion of opacity, where a predicate is symmetrically opaque 
if it is opaque as well as its negation. More precisely: 

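Definition 2 (Symmetrical opacity). A predicate $\varphi \subseteq \mathit{Run}(\mathcal{A})$ is symmetrically opaque
on system $\mathcal{A}$ for observation function $\mathcal{O}$ if, for any $o \in \mathit{Obs}$, the following holds:

$$\mathcal{O}^{-1}(o) \not\subseteq \varphi \quad \text{and} \quad \mathcal{O}^{-1}(o) \not\subseteq \overline{\varphi}.$$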
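
For a finite set of runs, both definitions can be checked directly by grouping runs into observation classes. The following is a minimal sketch under that finiteness assumption; the toy runs, the helper names and the projection `low_view` are illustrative and do not come from the paper.

```python
from collections import defaultdict

def observation_classes(runs, observe):
    """Map each observable o to its observation class O^{-1}(o)."""
    classes = defaultdict(set)
    for run in runs:
        classes[observe(run)].add(run)
    return classes

def is_opaque(runs, secret, observe):
    """Definition 1: no observation class is included in the secret."""
    classes = observation_classes(runs, observe)
    return all(not cls <= secret for cls in classes.values())

def is_symmetrically_opaque(runs, secret, observe):
    """Definition 2: both the secret and its complement are opaque."""
    runs = set(runs)
    return (is_opaque(runs, secret, observe)
            and is_opaque(runs, runs - secret, observe))

def low_view(run):
    """Hypothetical observation: erase the high-level action h."""
    return tuple(a for a in run if a != "h")

# Hypothetical toy systems: runs are tuples of actions, the secret is "h occurred".
leaky = {("h", "l1"), ("l1",), ("h", "l2")}
safe = {("h", "l1"), ("l1",)}
leaky_secret = {r for r in leaky if "h" in r}
safe_secret = {r for r in safe if "h" in r}

print(is_opaque(leaky, leaky_secret, low_view))
print(is_symmetrically_opaque(safe, safe_secret, low_view))
```

In the `leaky` system the class observed as `("l2",)` contains only runs with `h`, so the secret is not opaque; in `safe` the single class mixes both kinds of runs.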

The symmetrical opacity is a stronger security requirement. Security goals can be ex- 
pressed as either symmetrical or asymmetrical opacity, depending on the property at stake. 

For example, non-interference and anonymity can be expressed by opacity properties.
Non-interference states that an observer cannot know whether an action $h$ of high-level
accreditation occurred only by looking at the actions with a low level of accreditation in the
set $L$. So non-interference is equivalent to the opacity of predicate $\varphi_{NI}$, which is true when
$h$ occurred in the run, with respect to the observation function $\mathcal{O}_L$ that projects the trace
of a run onto the letters of $L$; see Section 3.2 for a full example. We refer to [4] and [25] for
other examples of properties using opacity. 

When the predicate breaks the symmetry of a model, the asymmetric definition is usually 
more suited. Symmetrical opacity is however used when knowing $\varphi$ or $\overline{\varphi}$ is equivalent from a
security point of view. For example, a noisy channel with binary input can be seen as a system
$\mathcal{A}$ on which the input is the truth value of $\varphi$ and the output is the observation $o \in \mathit{Obs}$. If
$\varphi$ is symmetrically opaque on $\mathcal{A}$ with respect to $\mathcal{O}$, then this channel is not perfect: there
would always be a possibility of erroneous transmission. The ties between channels and
probabilistic transition systems are studied in [14] (see discussion in Section 4.2). 

2.2 Probabilities and information theory 

Recall that, for a countable set $\Omega$, a discrete distribution (or distribution for short) is
a mapping $\mu : \Omega \to [0,1]$ such that $\sum_{\omega \in \Omega} \mu(\omega) = 1$. For any subset $E$ of $\Omega$,
$\mu(E) = \sum_{\omega \in E} \mu(\omega)$. The set of all discrete distributions on $\Omega$ is denoted by $\mathcal{D}(\Omega)$. A discrete random
variable with values in a set $\Gamma$ is a mapping $Z : \Omega \to \Gamma$ where $[Z = z]$ denotes the event
$\{\omega \in \Omega \mid Z(\omega) = z\}$.

The entropy of $Z$ is a measure of the uncertainty or, dually, the information about $Z$, defined
by the expected value of $-\log(\mu(Z))$:

$$H(Z) = -\sum_{z} \mu(Z = z) \cdot \log(\mu(Z = z))$$

where $\log$ is the base 2 logarithm.

For two random variables $Z$ and $Z'$ on $\Omega$, the conditional entropy of $Z$ given the event
$[Z' = z']$ such that $\mu(Z' = z') \neq 0$ is defined by:

$$H(Z \mid Z' = z') = -\sum_{z} \mu(Z = z \mid Z' = z') \cdot \log(\mu(Z = z \mid Z' = z'))$$

where $\mu(Z = z \mid Z' = z') = \frac{\mu(Z = z, Z' = z')}{\mu(Z' = z')}$.

The conditional entropy of $Z$ given the random variable $Z'$ can be interpreted as the
average entropy of $Z$ that remains after the observation of $Z'$. It is defined by:

$$H(Z \mid Z') = \sum_{z'} \mu(Z' = z') \cdot H(Z \mid Z' = z')$$

The vulnerability of a random variable $Z$, defined by $V(Z) = \max_z \mu(Z = z)$, gives the
probability of the likeliest event of a random variable. Vulnerability evaluates the probability
of a correct guess in one attempt and can also be used as a measure of information by defining 
min-entropy and conditional min-entropy (see discussions in [12,14]). 
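
The quantities above can be computed directly on a finite joint distribution. The sketch below assumes distributions are given as Python dictionaries mapping values (or pairs of values) to probabilities; the function names and the example distribution are illustrative only.

```python
from collections import defaultdict
from math import log2

def entropy(dist):
    """H(Z) for a distribution given as {value: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def marginal(joint, index):
    """Marginal of a joint distribution {(z, z2): probability} on one coordinate."""
    out = defaultdict(float)
    for pair, p in joint.items():
        out[pair[index]] += p
    return dict(out)

def conditional_entropy(joint):
    """H(Z | Z') for a joint distribution {(z, z2): probability}."""
    total = 0.0
    for z2, p_z2 in marginal(joint, 1).items():
        if p_z2 == 0:
            continue  # events of probability 0 do not contribute
        conditional = {z: p / p_z2 for (z, y), p in joint.items() if y == z2}
        total += p_z2 * entropy(conditional)
    return total

def vulnerability(dist):
    """V(Z): probability of the likeliest value."""
    return max(dist.values())

# Hypothetical joint distribution of (Z, Z').
joint = {(0, "a"): 0.25, (1, "a"): 0.25, (1, "b"): 0.5}
z_dist = marginal(joint, 0)

print(entropy(z_dist), conditional_entropy(joint), vulnerability(z_dist))
```

Here observing `"b"` determines Z completely, while observing `"a"` leaves one full bit of uncertainty, so the conditional entropy is the average of the two cases weighted by their probabilities.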

2.3 Probabilistic models 

In this work, systems are modeled using probabilistic automata behaving as finite automata 
where non-deterministic choices for the next action and state or termination are randomized: 
this is why they are called "fully probabilistic". We follow the model definition of [26],
which advocates for the use of this special termination action (denoted here by $\surd$ instead
of $\delta$ there). However, the difference lies in the model semantics: we consider only finite
runs, which involves a modified definition for the (discrete) probability on the set of runs. 
Extensions to the non-deterministic setting are discussed in Section 7. 

Recall that a finite automaton (FA) is a tuple $\mathcal{A} = (\Sigma, Q, \Delta, I, F)$ where $(\Sigma, Q, \Delta, I)$ is a
finite transition system and $F \subseteq Q$ is a subset of final states. The automaton is deterministic
if $I$ is a singleton and for all $q \in Q$ and $a \in \Sigma$, the set $\{q' \mid (q, a, q') \in \Delta\}$ is a singleton.
Runs in $\mathcal{A}$, $\mathit{Run}_q(\mathcal{A})$ and $\mathit{Run}(\mathcal{A})$ are defined like in a transition system. A run of an FA is
accepting if it ends in a state of $F$. The trace of a run $\rho = q_0 \xrightarrow{a_1} q_1 \cdots \xrightarrow{a_n} q_n$ is the word
$\mathrm{tr}(\rho) = a_1 \cdots a_n \in \Sigma^*$. The language of $\mathcal{A}$, written $\mathcal{L}(\mathcal{A})$, is the set of traces of accepting
runs starting in some initial state.

Replacing non-deterministic choices in an FA by choices based on a discrete distribution
and considering only finite runs results in a fully probabilistic finite automaton (FPFA).
Consistently with the standard notion of substochastic matrices, we also consider a more 
general class of automata, substochastic automata (SA), which allow us to describe subsets 
of behaviors from FPFAs, see Fig. 1 for examples. In both models, no non-determinism 
remains, thus the system is to be considered as autonomous: its behaviors do not depend 
on an exterior probabilistic agent acting as a scheduler for non-deterministic choices. 

Definition 3 (Substochastic automaton). Let $\surd$ be a new symbol representing a termination
action. A substochastic automaton (SA) is a tuple $(\Sigma, Q, \Delta, q_0)$ where $\Sigma$ is a
finite set of actions, $Q$ is a finite set of states, with $q_0 \in Q$ the initial state and
$\Delta : Q \to (((\Sigma \times Q) \uplus \{\surd\}) \to [0, 1])$ is a mapping such that for any $q \in Q$,

$$\sum_{x \in (\Sigma \times Q) \uplus \{\surd\}} \Delta(q)(x) \leq 1.$$

$\Delta$ defines substochastically the action and successor from the current state, or the termination
action $\surd$.

In SA, we write $q \to \mu$ for $\Delta(q) = \mu$ and $q \xrightarrow{a} r$ whenever $q \to \mu$ and $\mu(a, r) > 0$. We also
write $q \cdot \surd$ whenever $q \to \mu$ and $\mu(\surd) > 0$. In the latter case, $q$ is said to be a final state.

Definition 4 (Fully probabilistic finite automaton). A fully probabilistic finite automaton
(FPFA) is a particular case of SA where for all $q \in Q$, $\Delta(q) = \mu$ is a distribution in
$\mathcal{D}((\Sigma \times Q) \uplus \{\surd\})$, i.e.

$$\sum_{x \in (\Sigma \times Q) \uplus \{\surd\}} \Delta(q)(x) = 1$$

and for any state $q \in Q$ there exists a path (with non-zero probability) from $q$ to some final
state.

Note that we only target finite runs and therefore we consider a restricted case, where any 
infinite path has probability 0. 
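
The two definitions can be checked mechanically on a concrete automaton. The sketch below assumes a dictionary encoding in which each state maps to its sub-distribution over `(action, successor)` pairs and the termination symbol; this encoding, the function names and the two example automata are hypothetical and chosen only for illustration.

```python
SQRT = "√"  # termination action

def is_substochastic(delta, tol=1e-9):
    """Definition 3: in every state the weights sum to at most 1."""
    return all(sum(mu.values()) <= 1 + tol for mu in delta.values())

def can_terminate(delta, q):
    """True if some final state is reachable from q with non-zero probability."""
    seen, stack = set(), [q]
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        for x, p in delta[state].items():
            if p <= 0:
                continue
            if x == SQRT:
                return True
            stack.append(x[1])  # x is an (action, successor) pair
    return False

def is_fpfa(delta, tol=1e-9):
    """Definition 4: every state carries a full distribution and can terminate."""
    full = all(abs(sum(mu.values()) - 1) <= tol for mu in delta.values())
    return full and all(can_terminate(delta, q) for q in delta)

# Hypothetical automaton: loop on a, move to q1 on b, or terminate.
example = {
    "q0": {("a", "q0"): 0.5, ("b", "q1"): 0.25, SQRT: 0.25},
    "q1": {SQRT: 1.0},
}
# Same automaton with the b-transition removed: weights no longer sum to 1.
restricted = {"q0": {("a", "q0"): 0.5, SQRT: 0.25}}

print(is_fpfa(example), is_substochastic(restricted), is_fpfa(restricted))
```

Removing behaviors from an FPFA, as in `restricted`, typically leaves a substochastic automaton that is no longer fully probabilistic.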

Since FPFA is a subclass of SA, we overload the metavariable $\mathcal{A}$ for both SA and FPFA.
The notation above allows us to define a run for an SA like in a transition system as a finite
sequence of transitions written $\rho = q_0 \xrightarrow{a_1} q_1 \xrightarrow{a_2} q_2 \cdots \xrightarrow{a_n} q_n$. The sets $\mathit{Run}_q(\mathcal{A})$ and
$\mathit{Run}(\mathcal{A})$ are defined like in a transition system. A complete run is a (finite) sequence denoted
by $\rho \cdot \surd$ where $\rho$ is a run and $\Delta(\mathrm{lst}(\rho))(\surd) > 0$. The set $\mathit{CRun}(\mathcal{A})$ denotes the set of complete
runs starting from the initial state. In this work, we consider only such complete runs.

The trace of a run for an SA $\mathcal{A}$ is defined like in finite automata. The language of a
substochastic automaton $\mathcal{A}$, written $\mathcal{L}(\mathcal{A})$, is the set of traces of complete runs starting in
the initial state.

For an SA $\mathcal{A}$, a mapping $\mathbf{P}_\mathcal{A}$ into $[0, 1]$ can be defined inductively on the set of complete
runs by:

$$\mathbf{P}_\mathcal{A}(q\surd) = \mu(\surd) \quad \text{and} \quad \mathbf{P}_\mathcal{A}(q \xrightarrow{a} \rho) = \mu(a, r) \cdot \mathbf{P}_\mathcal{A}(\rho)$$

where $q \to \mu$ and $\mathrm{fst}(\rho) = r$.

The mapping $\mathbf{P}_\mathcal{A}$ is then a discrete distribution on $\mathit{CRun}(\mathcal{A})$. Indeed, the action $\surd$ can
be seen as a transition label towards a new sink state $q_\surd$. Then, abstracting from the labels
yields a finite Markov chain, where $q_\surd$ is the only absorbing state and the coefficients of the
transition matrix are $M_{q,q'} = \sum_{a \in \Sigma} \Delta(q)(a, q')$. The probability for a complete run to have
length $n$ is then the probability $p_n$ to reach $q_\surd$ in exactly $n$ steps. Therefore, the probability
of all finite complete runs is $\mathbf{P}(\mathit{CRun}(\mathcal{A})) = \sum_n p_n$ and a classical result [27] on absorbing
chains ensures that this probability is equal to 1.
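
The inductive definition and the absorbing-chain argument can be illustrated numerically. This is a sketch under an assumed dictionary encoding of the automaton (states mapped to weights over `(action, successor)` pairs and the termination symbol); the example automaton and all names are hypothetical.

```python
SQRT = "√"  # termination action

# Hypothetical FPFA: loop on a, move to q1 on b, or terminate.
delta = {
    "q0": {("a", "q0"): 0.5, ("b", "q1"): 0.25, SQRT: 0.25},
    "q1": {SQRT: 1.0},
}

def run_probability(delta, start, steps):
    """Probability of the complete run start -a1-> q1 ... -an-> qn followed by termination.

    `steps` is the list [(a1, q1), ..., (an, qn)].
    """
    state, prob = start, 1.0
    for action, target in steps:
        prob *= delta[state].get((action, target), 0.0)
        state = target
    return prob * delta[state].get(SQRT, 0.0)

def termination_mass(delta, start, max_actions):
    """Total probability of complete runs with at most `max_actions` actions."""
    current, mass = {start: 1.0}, 0.0
    for _ in range(max_actions + 1):
        following = {}
        for q, weight in current.items():
            for x, p in delta[q].items():
                if x == SQRT:
                    mass += weight * p  # run terminates here
                else:
                    following[x[1]] = following.get(x[1], 0.0) + weight * p
        current = following
    return mass

print(run_probability(delta, "q0", [("a", "q0"), ("b", "q1")]))
print(termination_mass(delta, "q0", 60))
```

As the bound on the number of actions grows, the accumulated mass approaches 1 for this automaton, in line with the absorbing-chain result.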

Since the probability space is not generated by (prefix-closed) cones, this definition does 
not yield the same probability measure as the one from [26]. Since opacity properties are 
not necessarily prefix-closed, this definition is consistent with our approach. 

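When $\mathcal{A}$ is clear from the context, $\mathbf{P}_\mathcal{A}$ will simply be written $\mathbf{P}$.

Since $\mathbf{P}_\mathcal{A}$ is a (sub-)probability on $\mathit{CRun}(\mathcal{A})$, for any predicate $\varphi \subseteq \mathit{CRun}(\mathcal{A})$, we
have $\mathbf{P}(\varphi) = \sum_{\rho \in \varphi} \mathbf{P}(\rho)$. The measure is extended to languages $K \subseteq \mathcal{L}(\mathcal{A})$ by
$\mathbf{P}(K) = \mathbf{P}(\mathrm{tr}^{-1}(K)) = \sum_{\mathrm{tr}(\rho) \in K} \mathbf{P}(\rho)$.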

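In the examples of Fig. 1, restricting the complete runs of $\mathcal{A}_1$ to those satisfying
$\varphi = \{\rho \mid \mathrm{tr}(\rho) \in a^*\}$ yields the SA $\mathcal{A}_2$, and $\mathbf{P}_{\mathcal{A}_1}(\varphi) = \mathbf{P}_{\mathcal{A}_2}(\mathit{CRun}(\mathcal{A}_2))$.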

A non probabilistic version of any SA is obtained by forgetting any information about 
probabilities. 

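Definition 5. Let $\mathcal{A} = (\Sigma, Q, \Delta, q_0)$ be an SA. The (non-deterministic) finite automaton
$\mathit{unProb}(\mathcal{A}) = (\Sigma, Q, \Delta', q_0, F)$ is defined by:

- $\Delta' = \{(q, a, r) \in Q \times \Sigma \times Q \mid q \to \mu,\ \mu(a, r) > 0\}$,

- $F = \{q \in Q \mid q \to \mu,\ \mu(\surd) > 0\}$ is the set of final states.

It is easily seen that $\mathcal{L}(\mathit{unProb}(\mathcal{A})) = \mathcal{L}(\mathcal{A})$.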

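An observation function $\mathcal{O} : \mathit{CRun}(\mathcal{A}) \to \mathit{Obs}$ can also be easily translated from the
probabilistic to the non probabilistic setting. For $\mathcal{A}' = \mathit{unProb}(\mathcal{A})$, we define $\mathit{unProb}(\mathcal{O})$
on $\mathit{Run}(\mathcal{A}')$ by $\mathit{unProb}(\mathcal{O})(q_0 \xrightarrow{a_1} q_1 \cdots q_n) = \mathcal{O}(q_0 \xrightarrow{a_1} q_1 \cdots q_n \surd)$.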
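
Forgetting probabilities amounts to keeping the support of each sub-distribution. A minimal sketch, assuming the automaton is encoded as a dictionary from states to weights over `(action, successor)` pairs and the termination symbol (an encoding chosen here for illustration, not taken from the paper):

```python
SQRT = "√"  # termination action

def unprob(delta):
    """Return (transitions, final_states) of the non-probabilistic automaton."""
    transitions = {(q, x[0], x[1])
                   for q, mu in delta.items()
                   for x, p in mu.items()
                   if x != SQRT and p > 0}
    final_states = {q for q, mu in delta.items() if mu.get(SQRT, 0) > 0}
    return transitions, final_states

# Hypothetical SA used only as an example.
delta = {
    "q0": {("a", "q0"): 0.5, ("b", "q1"): 0.25, SQRT: 0.25},
    "q1": {SQRT: 1.0},
}
transitions, final_states = unprob(delta)
print(sorted(transitions), sorted(final_states))
```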




(a) FPFA $\mathcal{A}_1$ (b) SA $\mathcal{A}_2$

Fig. 1. $\mathcal{A}_2$ is the restriction of $\mathcal{A}_1$ to $a^*$.

3 Measuring non-opacity

3.1 Definition and properties

One of the aspects in which the definition of opacity could be extended to probabilistic
automata is by relaxing the universal quantifiers of Definitions 1 and 2. Instead of requiring
that every observation class should not be included in $\varphi$ (resp. $\varphi$ or $\overline{\varphi}$ for the symmetrical
case), we can just require that almost all of them are not. To obtain this, we give a measure for the
set of runs leaking information. To express properties of probabilistic opacity in an FPA $\mathcal{A}$,
the observation function $\mathcal{O}$ is considered as a random variable, as well as the characteristic
function $\mathbf{1}_\varphi$ of $\varphi$. Both the asymmetrical and the symmetrical notions of opacity can be
generalized in this manner.

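Definition 6 (Liberal probabilistic opacity). The liberal probabilistic opacity or LPO
of predicate $\varphi$ on FPA $\mathcal{A}$, with respect to (surjective) observation function $\mathcal{O} : \mathit{CRun} \to \mathit{Obs}$
is defined by:

$$\mathsf{PO}_\ell^A(\mathcal{A}, \varphi, \mathcal{O}) = \sum_{\substack{o \in \mathit{Obs} \\ \mathcal{O}^{-1}(o) \subseteq \varphi}} \mathbf{P}(\mathcal{O} = o).$$

The liberal probabilistic symmetrical opacity or LPSO is defined by:

$$\mathsf{PO}_\ell^S(\mathcal{A}, \varphi, \mathcal{O}) = \mathsf{PO}_\ell^A(\mathcal{A}, \varphi, \mathcal{O}) + \mathsf{PO}_\ell^A(\mathcal{A}, \overline{\varphi}, \mathcal{O}) = \sum_{\substack{o \in \mathit{Obs} \\ \mathcal{O}^{-1}(o) \subseteq \varphi}} \mathbf{P}(\mathcal{O} = o) + \sum_{\substack{o \in \mathit{Obs} \\ \mathcal{O}^{-1}(o) \subseteq \overline{\varphi}}} \mathbf{P}(\mathcal{O} = o).$$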

This definition provides a measure of how insecure the system is. The following proposition
shows that a null value for these measures coincides with (symmetrical) opacity for
the system, which is then secure.

For LPO, it corresponds to classes either overlapping both $\varphi$ and $\overline{\varphi}$ or included in $\overline{\varphi}$ as in
Fig. 2(a). LPO measures only the classes that leak their inclusion in $\varphi$. So classes included
in $\overline{\varphi}$ are not taken into account. At the other extreme, $\mathsf{PO}_\ell^A(\mathcal{A}, \varphi, \mathcal{O}) = 1$ when $\varphi$ is
always true.

When LPSO is null, it means that each equivalence class $\mathcal{O}^{-1}(o)$ overlaps both $\varphi$ and $\overline{\varphi}$
as in Fig. 2(c). On the other hand, the system is totally insecure when, observing through
$\mathcal{O}$, we have all information about $\varphi$. In that case, the predicate $\varphi$ is a union of equivalence
classes $\mathcal{O}^{-1}(o)$ as in Fig. 2(e) and this can be interpreted in terms of conditional entropy
relatively to $\mathcal{O}$. The intermediate case occurs when some, but not all, observation classes
contain only runs satisfying $\varphi$ or only runs not satisfying $\varphi$, as in Fig. 2(d).
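
When the set of complete runs is finite and their probabilities are known, the liberal measures reduce to summing the probabilities of the leaking observation classes. The sketch below makes that finiteness assumption; the run probabilities, the observation `low_view` and the function names are invented for illustration and are not the values of any example in this paper.

```python
from collections import defaultdict

def lpo(run_probs, secret, observe):
    """Sum of P(O = o) over the observation classes entirely included in `secret`."""
    classes = defaultdict(dict)
    for run, p in run_probs.items():
        classes[observe(run)][run] = p
    return sum(sum(cls.values())
               for cls in classes.values()
               if all(run in secret for run in cls))

def lpso(run_probs, secret, observe):
    """LPO of the secret plus LPO of its complement."""
    complement = set(run_probs) - set(secret)
    return lpo(run_probs, secret, observe) + lpo(run_probs, complement, observe)

def low_view(run):
    """Hypothetical observation: erase the high-level action h."""
    return tuple(a for a in run if a != "h")

# Hypothetical distribution over complete runs (tuples of actions).
run_probs = {("h", "l1"): 0.25, ("l1",): 0.5, ("h", "l2"): 0.25}
secret = {run for run in run_probs if "h" in run}

print(lpo(run_probs, secret, low_view), lpso(run_probs, secret, low_view))
```

Only the class observed as `("l2",)` is included in the secret, so it alone contributes to LPO; no class is included in the complement, so LPSO has the same value here.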

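Proposition 1.

(1) $0 \leq \mathsf{PO}_\ell^A(\mathcal{A}, \varphi, \mathcal{O}) \leq 1$ and $0 \leq \mathsf{PO}_\ell^S(\mathcal{A}, \varphi, \mathcal{O}) \leq 1$

(2) $\mathsf{PO}_\ell^A(\mathcal{A}, \varphi, \mathcal{O}) = 0$ if and only if $\varphi$ is opaque on $\mathit{unProb}(\mathcal{A})$ with respect to $\mathit{unProb}(\mathcal{O})$.
$\mathsf{PO}_\ell^S(\mathcal{A}, \varphi, \mathcal{O}) = 0$ if and only if $\varphi$ is symmetrically opaque on $\mathit{unProb}(\mathcal{A})$ with respect
to $\mathit{unProb}(\mathcal{O})$.

(3) $\mathsf{PO}_\ell^A(\mathcal{A}, \varphi, \mathcal{O}) = 1$ if and only if $\varphi = \mathit{CRun}(\mathcal{A})$.
$\mathsf{PO}_\ell^S(\mathcal{A}, \varphi, \mathcal{O}) = 1$ if and only if $H(\mathbf{1}_\varphi \mid \mathcal{O}) = 0$.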
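
Proof (of Proposition 1).

(1) The considered events are mutually exclusive, hence the sum of their probabilities never
exceeds 1.

(2) First observe that a complete run $r_0 a \ldots r_n \surd$ has a non null probability in $\mathcal{A}$ iff $r_0 a \ldots r_n$
is a run in $\mathit{unProb}(\mathcal{A})$. Suppose $\mathsf{PO}_\ell^A(\mathcal{A}, \varphi, \mathcal{O}) = 0$. Recall that $\mathcal{O}$ is assumed surjective.
Then there is no observable $o$ such that $\mathcal{O}^{-1}(o) \subseteq \varphi$. Conversely, if $\varphi$ is opaque on
$\mathit{unProb}(\mathcal{A})$, there is no observable $o \in \mathit{Obs}$ such that $\mathcal{O}^{-1}(o) \subseteq \varphi$, hence the null value
for $\mathsf{PO}_\ell^A(\mathcal{A}, \varphi, \mathcal{O})$. The case of LPSO is similar, also taking into account the dual case
of $\overline{\varphi}$ in the above.

(3) For LPO, this is straightforward from the definition. For LPSO, $H(\mathbf{1}_\varphi \mid \mathcal{O}) = 0$ iff

$$\sum_{\substack{o \in \mathit{Obs} \\ i \in \{0,1\}}} \mathbf{P}(\mathbf{1}_\varphi = i, \mathcal{O} = o) \cdot \log(\mathbf{P}(\mathbf{1}_\varphi = i \mid \mathcal{O} = o)) = 0.$$

Since all the terms have the same sign, this sum is null if and only if each of its terms is
null. Setting for every $o \in \mathit{Obs}$, $f(o) = \mathbf{P}(\mathbf{1}_\varphi = 1 \mid \mathcal{O} = o) = 1 - \mathbf{P}(\mathbf{1}_\varphi = 0 \mid \mathcal{O} = o)$, we
have: $H(\mathbf{1}_\varphi \mid \mathcal{O}) = 0$ iff $\forall o \in \mathit{Obs}$, $f(o) \cdot \log(f(o)) + (1 - f(o)) \cdot \log(1 - f(o)) = 0$. Since
the equation $x \cdot \log(x) + (1 - x) \cdot \log(1 - x) = 0$ only accepts 1 and 0 as solutions, it
means that for every observable $o$, either all the runs $\rho$ such that $\mathcal{O}(\rho) = o$ are in $\varphi$, or
they are all not in $\varphi$. Therefore $H(\mathbf{1}_\varphi \mid \mathcal{O}) = 0$ iff for every observable $o$, $\mathcal{O}^{-1}(o) \subseteq \varphi$ or
$\mathcal{O}^{-1}(o) \subseteq \overline{\varphi}$, which is equivalent to $\mathsf{PO}_\ell^S(\mathcal{A}, \varphi, \mathcal{O}) = 1$.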

Fig. 3. Interferent FPAs $\mathcal{A}_3$ and $\mathcal{A}_4$.

3.2 Example: Non-interference

For the systems $\mathcal{A}_3$ and $\mathcal{A}_4$ of Fig. 3, we use the predicate $\varphi_{NI}$ which is true if the trace of a
run contains the letter $h$. In both cases the observation function $\mathcal{O}_L$ returns the projection of
the trace onto the alphabet $\{\ell_1, \ell_2\}$. Remark that this example is an interference property [3]
seen as opacity. Considered unprobabilistically, both systems are interferent since an $\ell_2$ not
preceded by an $\ell_1$ betrays the presence of an $h$. However, they differ by how often this case
happens.

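The runs of $\mathcal{A}_3$ and $\mathcal{A}_4$ and their properties are displayed in Table 2. Then we can see
that $[\rho_1]_{\mathcal{O}_L} = [\rho_2]_{\mathcal{O}_L}$ overlaps both $\varphi_{NI}$ and $\overline{\varphi_{NI}}$, while $[\rho_3]_{\mathcal{O}_L}$ is contained totally in $\varphi_{NI}$.
Hence the LPO can be computed for both systems: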

POf(A 3 , <PNI,0 L ) = \ POf{M, <PNI, L ) = 

Therefore $\mathcal{A}_3$ is more secure than $\mathcal{A}_4$. Indeed, the run that is interferent occurs more often
in $\mathcal{A}_4$, leaking information more often.

Note that in this example, LPO and LPSO coincide. This is not always the case. Indeed,
in the unprobabilistic setting, both symmetrical and asymmetrical opacity of $\varphi_{NI}$ with
respect to $\mathcal{O}_L$ express the intuitive notion that "an external observer does not know whether
an action happened or not" . The asymmetrical notion corresponds to the definition of strong 
nondeterministic non-interference in [3] while the symmetrical one was defined as perfect 
security property in [6]. 

Table 2. Runs of $\mathcal{A}_3$ and $\mathcal{A}_4$.

4 Measuring the robustness of opacity

The completely opposite direction that can be taken to define a probabilistic version is a 
more paranoid one: how much information is leaked through the system's uncertainty? For 
example, on Fig. 2(c), even though each observation class contains a run in $\varphi$ and one in $\overline{\varphi}$,
some classes are nearly in $\varphi$. In some other classes the balance between the runs satisfying
$\varphi$ and the ones not satisfying $\varphi$ is more even. Hence for each observation class, we will not
ask if it is included in $\varphi$, but how likely $\varphi$ is to be true inside this class with a probabilistic
measure taking into account the likelihood of classes. This amounts to measuring, inside
each observation class, $\overline{\varphi}$ in the case of asymmetrical opacity, and the balance between $\varphi$
and $\overline{\varphi}$ in the case of symmetrical opacity. Note that these new measures are relevant only
for opaque systems, where the previous liberal measures are equal to zero.

In [1], another measure was proposed, based on the notion of mutual information (from 
information theory, along similar lines as in [12]). However, this measure had a weaker 
link with possibilistic opacity (see discussion in Section 6). What we call here RPO is a 
new measure, whose relation with possibilistic opacity is expressed by the second item in 
Proposition 2. 

4.1 Restricting Asymmetrical Opacity 

In this section we extend the notion of asymmetrical opacity in order to measure how secure 
the system is. 

Definition and properties. In this case, an observation class is more secure if $\varphi$ is less
likely to be true. That means that it is easy (as in "more likely") to find a run not in $\varphi$ in
the same observation class. Dually, a high probability for $\varphi$ inside a class means that few
(again probabilistically speaking) runs will be in the same class yet not in $\varphi$.

Restrictive probabilistic opacity is defined to measure this effect globally on all observation
classes. It is tailored to fit the definition of opacity in the classical sense: indeed, if one
class totally leaks its presence in $\varphi$, RPO will detect it (second point in Proposition 2).

Definition 7. Let $\varphi$ be a predicate on the complete runs of an FPA $\mathcal{A}$ and $\mathcal{O}$ an observation
function. The restrictive probabilistic opacity (RPO) of $\varphi$ on $\mathcal{A}$, with respect to $\mathcal{O}$, is defined
by

$$\frac{1}{\mathsf{PO}_r^A(\mathcal{A},\varphi,\mathcal{O})} = \sum_{o \in Obs} P(\mathcal{O} = o) \cdot \frac{1}{P(\mathbf{1}_\varphi = 0 \mid \mathcal{O} = o)}$$

RPO is the harmonic mean (weighted by the probabilities of observations) of the probability
that $\varphi$ is false in a given observation class. The harmonic mean averages the leakage
of information inside each class. Since security and robustness are often evaluated on the
weakest link, more weight is given to observation classes with the higher leakage, i.e. those
with probability of $\varphi$ being false closest to 0.
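
The weighting effect of the harmonic mean can be illustrated with a minimal Python sketch. It assumes that RPO is the weighted harmonic mean described above; the function name `rpo` and the two toy systems are hypothetical and only serve the illustration.

```python
from fractions import Fraction

def rpo(classes):
    """Weighted harmonic mean of P(phi is false | O = o).

    `classes` maps each observable o to a pair
    (P(O = o), P(phi is false | O = o)); the weights are assumed to sum to 1.
    """
    inverse = Fraction(0)
    for weight, p_false in classes.values():
        if weight == 0:
            continue  # classes of probability 0 do not contribute
        if p_false == 0:
            return Fraction(0)  # the class lies inside phi: 1/p_false is infinite
        inverse += Fraction(weight) / Fraction(p_false)
    return 1 / inverse

half = Fraction(1, 2)
balanced = {"o1": (half, half), "o2": (half, half)}
skewed = {"o1": (half, Fraction(3, 4)), "o2": (half, Fraction(1, 4))}

print(rpo(balanced))  # 1/2
print(rpo(skewed))    # 3/8, although the arithmetic mean is still 1/2
```

In the second toy system the class where $\varphi$ is false with probability only 1/4 dominates the result, which is the "weakest link" behaviour mentioned above.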

The following proposition gives properties of RPO. 

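Proposition 2.

(1) $0 \leq \mathsf{PO}_r^A(\mathcal{A},\varphi,\mathcal{O}) \leq 1$

(2) $\mathsf{PO}_r^A(\mathcal{A},\varphi,\mathcal{O}) = 0$ if and only if $\varphi$ is not opaque on $unProb(\mathcal{A})$ with respect to $unProb(\mathcal{O})$

(3) $\mathsf{PO}_r^A(\mathcal{A},\varphi,\mathcal{O}) = 1$ if and only if $\varphi = \emptyset$.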

Proof. The first point immediately results from the fact that RPO is a mean of values
between 0 and 1.

From the definition above, RPO is null if and only if there is one class that is contained
in $\varphi$. Indeed, this corresponds to the case where the value of $\frac{1}{P(\mathbf{1}_\varphi = 0 \mid \mathcal{O} = o)}$ goes to $+\infty$, for
some $o$, as well as the sum.

Thirdly, if $\varphi$ is always false, then RPO is 1 since it is a mean of probabilities all of value
1. Conversely, if RPO is 1, because it is defined as an average of values between 0 and 1,
then all these values must be equal to 1, hence for each $o$, $P(\mathbf{1}_\varphi = 0 \mid \mathcal{O} = o) = 1$, which
means that $P(\mathbf{1}_\varphi = 0) = 1$ and $\varphi$ is false.

Example: Debit Card System. Consider a Debit Card system in a store. When a card
is inserted, an amount of money $x$ to be debited is entered, and the client enters his PIN
number (all this being gathered as the action $Buy(x)$). The amount of the transaction is
given probabilistically as an abstraction of the statistics of such transactions. Provided the
PIN is correct, the system can either directly allow the transaction, or interrogate the client's
bank for solvency. In order to balance the cost associated with this verification (bandwidth,
server computation, etc.) with the loss induced if an insolvent client was debited, the decision
to interrogate the bank's servers is taken probabilistically according to the amount
of the transaction. When interrogated, the bank can reject the transaction with a certain
probability⁴ or accept it. This system is represented by the FPA $\mathcal{A}_{card}$ of Fig. 4.

Fig. 4. The Debit Card system $\mathcal{A}_{card}$.



4 Although the bank process to allow or forbid the transaction is deterministic, the statistics of 
the result can be abstracted into probabilities. 
Now assume an external observer can only observe if there has been a call or not to the
bank server. In practice, this can be achieved, for example, by measuring the time taken for
the transaction to be accepted (it takes longer when the bank is called), or by spying on the
telephone line linking the store to the bank's servers (detecting activity on the network or
idleness). Suppose what the external observer wants to know is whether the transaction was
worth more than 500€. By using RPO, one can assess how this knowledge can be derived
from observation.

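Formally, in this case the observables are $\{\varepsilon, Call\}$, the observation function $\mathcal{O}_{Call}$ being
the projection on $\{Call\}$. The predicate to be hidden from the user is represented by the regular
expression $\varphi_{>500} = \Sigma^*(\text{"}x > 1000\text{"} + \text{"}500 < x < 1000\text{"})\Sigma^*$ (where $\Sigma$ is the whole alphabet).
By definition of RPO:

$$\frac{1}{\mathsf{PO}_r^A(\mathcal{A}_{card},\varphi_{>500},\mathcal{O}_{Call})} = P(\mathcal{O}_{Call} = \varepsilon) \cdot \frac{1}{P(\neg\varphi_{>500} \mid \mathcal{O}_{Call} = \varepsilon)} + P(\mathcal{O}_{Call} = Call) \cdot \frac{1}{P(\neg\varphi_{>500} \mid \mathcal{O}_{Call} = Call)}$$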

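Computing successively $P(\mathcal{O}_{Call} = \varepsilon)$, $P(\neg\varphi_{>500} \mid \mathcal{O}_{Call} = \varepsilon)$, $P(\mathcal{O}_{Call} = Call)$, and
$P(\neg\varphi_{>500} \mid \mathcal{O}_{Call} = Call)$ (see Appendix A), we obtain:

$$\mathsf{PO}_r^A(\mathcal{A}_{card},\varphi_{>500},\mathcal{O}_{Call}) = \frac{28272}{3\ldots} \approx 0.718.$$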



The notion of asymmetrical opacity, however, fails to capture security in terms of opacity
for both $\varphi$ and $\overline{\varphi}$. And so does the RPO measure. Therefore we define in the next section a
quantitative version of symmetrical opacity.

4.2 Restricting symmetrical opacity 

Symmetrical opacity offers a sound framework to analyze the secret of a binary value. For
example, consider a binary channel with $n$ outputs. It can be modeled by a tree-like system
branching on 0 and 1 at the first level, then branching on observables $\{o_1, \ldots, o_n\}$, as in
Fig. 5. If the system wishes to prevent communication, the secret of predicate "the input of
the channel was 1" is as important as the secret of its negation; in this case "the input of
the channel was 0". Such a case is an example of initial opacity [4], since the secret appears
only at the start of each run. Note that any system with initial opacity and a finite set of
observables can be transformed into a channel [14], with input distribution $(p, 1 - p)$, which
is the distribution of the secret predicate over $\{\varphi, \overline{\varphi}\}$.

Definition and properties. Symmetrical opacity ensures that for each observation class
$o$ (reached by a run), both $P(\varphi \mid o)$ and $P(\overline{\varphi} \mid o)$ are strictly above 0.
That means that the lower of these probabilities should be above 0. In turn, the lowest of
these probabilities is exactly the complement of the vulnerability (since $\mathbf{1}_\varphi$ can take only two
values). That is, the security is measured with the probability of error in one guess (inside
a given observation class). Hence, a system will be secure if, in each observation class, $\varphi$ is
balanced with $\overline{\varphi}$.

Fig. 5. A system and its associated channel: (a) a system with initial secret; (b) channel with binary input.



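Definition 8 (Restrictive probabilistic symmetric opacity). Let $\varphi$ be a predicate on
the complete runs of an FPA $\mathcal{A}$ and $\mathcal{O}$ an observation function. The restrictive probabilistic
symmetric opacity (RPSO) of $\varphi$ on $\mathcal{A}$, with respect to $\mathcal{O}$, is defined by

$$\mathsf{PO}_r^S(\mathcal{A},\varphi,\mathcal{O}) = \frac{-1}{\sum_{o \in Obs} P(\mathcal{O} = o) \cdot \log\big(1 - V(\mathbf{1}_\varphi \mid \mathcal{O} = o)\big)}$$

where $V(\mathbf{1}_\varphi \mid \mathcal{O} = o) = \max_{i \in \{0,1\}} P(\mathbf{1}_\varphi = i \mid \mathcal{O} = o)$.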

Remark that the definition of RPSO has very few ties with the definition of RPO. Indeed,
it is linked more with the notion of possibilistic symmetrical opacity than with the notion
of quantitative asymmetrical opacity, and thus RPSO is not to be seen as an extension of
RPO.

In the definition of RPSO, taking $-\log(1 - V(\mathbf{1}_\varphi \mid \mathcal{O} = o))$ gives more weight
to very imbalanced classes, up to infinity for classes completely included either in $\varphi$ or in $\overline{\varphi}$.
Along the lines of [12], the logarithm is used in order to produce a measure in terms of bits
instead of probabilities. These measures are then averaged with respect to the probability
of each observation class. The final inversion ensures that the value is between 0 and 1, and
can be seen as a normalization operation. The above motivations for the definition of RPSO
directly yield the following properties:

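Proposition 3.

(1) $0 \leq \mathsf{PO}_r^S(\mathcal{A},\varphi,\mathcal{O}) \leq 1$

(2) $\mathsf{PO}_r^S(\mathcal{A},\varphi,\mathcal{O}) = 0$ if and only if $\varphi$ is not symmetrically opaque on $unProb(\mathcal{A})$ with
respect to $unProb(\mathcal{O})$.

(3) $\mathsf{PO}_r^S(\mathcal{A},\varphi,\mathcal{O}) = 1$ if and only if $\forall o \in Obs$, $P(\mathbf{1}_\varphi = 1 \mid \mathcal{O} = o) = \frac{1}{2}$.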



Proof (of Proposition 3).

(1) Since the vulnerability of a random variable that takes only two values is between $\frac{1}{2}$ and
1, we have $1 - V(\mathbf{1}_\varphi \mid \mathcal{O} = o) \in [0, \frac{1}{2}]$ for all $o \in Obs$. So $-\log(1 - V(\mathbf{1}_\varphi \mid \mathcal{O} = o)) \in
[1, +\infty[$ for any $o$. The (arithmetic) mean of these values is thus contained within the
same bounds. The inversion therefore yields a value between 0 and 1.

(2) If $\varphi$ is not symmetrically opaque, then for some observation class $o$, $\mathcal{O}^{-1}(o) \subseteq \varphi$ or
$\mathcal{O}^{-1}(o) \subseteq \overline{\varphi}$. In both cases, $V(\mathbf{1}_\varphi \mid \mathcal{O} = o) = 1$, so $-\log(1 - V(\mathbf{1}_\varphi \mid \mathcal{O} = o)) = +\infty$ and the
average is also $+\infty$. Taking the limit for the inverse gives the value 0 for RPSO.
Conversely, if RPSO is 0, then its inverse is $+\infty$, which can only occur if one of the
$-\log(1 - V(\mathbf{1}_\varphi \mid \mathcal{O} = o))$ is $+\infty$ for some $o$. This, in turn, means that some $V(\mathbf{1}_\varphi \mid \mathcal{O} = o)$
is 1, which means the observation class of $o$ is contained either in $\varphi$ or in $\overline{\varphi}$.

(3) $\mathsf{PO}_r^S(\mathcal{A},\varphi,\mathcal{O}) = 1$ iff $\sum_{o \in Obs} P(\mathcal{O} = o) \cdot (-\log(1 - V(\mathbf{1}_\varphi \mid \mathcal{O} = o))) = 1$. Since this is
an average of values above 1, this is equivalent to $-\log(1 - V(\mathbf{1}_\varphi \mid \mathcal{O} = o)) = 1$ for
all $o \in Obs$, i.e. $V(\mathbf{1}_\varphi \mid \mathcal{O} = o) = \frac{1}{2}$ for all $o$. In this particular case, we also have
$V(\mathbf{1}_\varphi \mid \mathcal{O} = o) = \frac{1}{2}$ iff $P(\mathbf{1}_\varphi = 1 \mid \mathcal{O} = o) = \frac{1}{2}$, which concludes the proof.
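
The three properties can be observed numerically on small cases. The following Python sketch assumes the reading of RPSO as the inverse of the weighted average of $-\log_2(1 - V)$ over observation classes; the function name `rpso` and the input encoding are hypothetical.

```python
import math

def rpso(classes):
    """`classes` maps each observable o to (P(O = o), P(phi is true | O = o))."""
    inverse = 0.0
    for weight, p_true in classes.values():
        if weight == 0:
            continue  # classes of probability 0 do not contribute
        vulnerability = max(p_true, 1 - p_true)
        if vulnerability == 1:
            return 0.0  # class inside phi or its complement: -log(0) is infinite
        inverse += weight * -math.log2(1 - vulnerability)
    return 1 / inverse

print(rpso({"o1": (0.5, 0.5), "o2": (0.5, 0.5)}))    # 1.0
print(rpso({"o1": (0.5, 0.25), "o2": (0.5, 0.75)}))  # 0.5
print(rpso({"o1": (0.5, 0.5), "o2": (0.5, 1.0)}))    # 0.0
```

The first call corresponds to item (3), the last one to item (2), and the intermediate case shows a value strictly between the two bounds of item (1).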



Example 1: Sale protocol. We consider the sale protocol from [15], depicted in Fig. 6.
Two products can be put on sale, either a cheap or an expensive one, and two clients, either
a rich or a poor one, may want to buy it. The products are put on sale according to a
distribution ($\alpha$ and $\overline{\alpha} = 1 - \alpha$) while buyers behave probabilistically (through $\beta$ and $\gamma$)
although differently according to the price of the item on sale. The price of the item is
public, but the identity of the buyer should remain secret. Hence the observation function
$Price$ yields cheap or expensive, while the secret is, without loss of symmetry, the set $\varphi_{poor}$
of runs ending with poor. The bias introduced by the preference of, say, a cheap item by
the poor client betrays the secret identity of the buyer. RPSO allows measuring this bias,
and more importantly, comparing the bias obtained globally for different values of the
parameters $\alpha$, $\beta$, and $\gamma$.

Fig. 6. A simple sale protocol represented as an FPA $Sale$.
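More formally, we have:

$$P(Price = cheap) = \alpha \qquad P(Price = expensive) = \overline{\alpha}$$

$$V(\mathbf{1}_{\varphi_{poor}} \mid \mathcal{O} = cheap) = \max(\beta,\overline{\beta}) \qquad V(\mathbf{1}_{\varphi_{poor}} \mid \mathcal{O} = expensive) = \max(\gamma,\overline{\gamma})$$

$$\mathsf{PO}_r^S(Sale, \varphi_{poor}, Price) = \frac{-1}{\alpha \cdot \log(\min(\beta,\overline{\beta})) + \overline{\alpha} \cdot \log(\min(\gamma,\overline{\gamma}))}$$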



The variations of RPSO w.r.t. $\beta$ and $\gamma$ are depicted for several values of $\alpha$ in Fig. 7, red
meaning higher value for RPSO. Thus, while the result is symmetric for $\alpha = \frac{1}{2}$, the case
where $\alpha < \frac{1}{2}$ gives more importance to the fluctuations of $\gamma$.
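
The influence of the three parameters can be explored with a short Python sketch. It assumes the closed form $-1/(\alpha \log_2 \min(\beta, 1-\beta) + (1-\alpha) \log_2 \min(\gamma, 1-\gamma))$ with base-2 logarithms; the function name `rpso_sale` and the sample values are hypothetical.

```python
import math

def rpso_sale(alpha, beta, gamma):
    """RPSO of the sale protocol; alpha, beta, gamma are probabilities in [0, 1]."""
    terms = [(alpha, min(beta, 1 - beta)), (1 - alpha, min(gamma, 1 - gamma))]
    inverse = 0.0
    for weight, error in terms:
        if weight == 0:
            continue  # this price is never observed
        if error == 0:
            return 0.0  # the buyer is fully determined by the price
        inverse += weight * -math.log2(error)
    return 1 / inverse

# With alpha = 1/2 the roles of beta and gamma can be exchanged.
print(rpso_sale(0.5, 0.3, 0.4) == rpso_sale(0.5, 0.4, 0.3))
# With alpha = 1/4 the same bias costs more on gamma than on beta.
print(rpso_sale(0.25, 0.5, 0.1) < rpso_sale(0.25, 0.1, 0.5))
```

Both comparisons evaluate to true under these assumptions: the weight $1 - \alpha$ attached to the expensive item grows as $\alpha$ decreases, so a bias on $\gamma$ lowers the measure more.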
Example 2: Dining Cryptographers Protocol. Introduced in [7], this problem involves
three cryptographers $C_1$, $C_2$ and $C_3$ dining in a restaurant. At the end of the meal, their
master secretly tells each of them if they should be paying: $p_i = 1$ iff cryptographer $C_i$ pays,
and $p_i = 0$ otherwise. Wanting to know if one of the cryptographers paid or if the master
did, they follow the following protocol. Each cryptographer flips a coin with each of his
neighbors, the third one not seeing the result of the flip, marking $f_{i,j} = 0$ if the coin flip
between $i$ and $j$ was heads and $f_{i,j} = 1$ if it was tails. Then each cryptographer $C_i$, for
$i \in \{1,2,3\}$, announces the value of $r_i = f_{i,i+1} \oplus f_{i,i-1} \oplus p_i$ (where '3 + 1 = 1', '1 - 1 = 3' and
'$\oplus$' represents the XOR operator). If $\bigoplus_{i=1}^{3} r_i = 0$ then no one (i.e. the master) paid, if
$\bigoplus_{i=1}^{3} r_i = 1$, then one of the cryptographers paid, but the other two do not know who he is.
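
One round of the protocol can be sketched in Python as follows. The encoding (0 for heads, 1 for tails, `None` when the master pays) and the function names are assumptions made for this illustration.

```python
from itertools import product

def announcements(coins, payer):
    """Return {i: r_i} for one round of the three-party protocol.

    `coins` maps each unordered pair (1, 2), (1, 3), (2, 3) to 0 (heads) or
    1 (tails); `payer` is 1, 2, 3, or None when the master pays.
    """
    def flip(i, j):
        return coins[(min(i, j), max(i, j))]

    result = {}
    for i in (1, 2, 3):
        successor = i % 3 + 1          # '3 + 1 = 1'
        predecessor = (i - 2) % 3 + 1  # '1 - 1 = 3'
        p_i = 1 if payer == i else 0
        result[i] = flip(i, successor) ^ flip(i, predecessor) ^ p_i
    return result

def someone_paid(r):
    """XOR of the three announcements: 1 iff one of the cryptographers paid."""
    return (r[1] ^ r[2] ^ r[3]) == 1

coins = {(1, 2): 0, (1, 3): 1, (2, 3): 1}
print(someone_paid(announcements(coins, payer=2)))     # True
print(someone_paid(announcements(coins, payer=None)))  # False
```

Each coin appears in exactly two announcements and therefore cancels in the global XOR, which leaves only the bits $p_i$.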

Here we will use a simplified version of this problem to limit the size of the model. We
consider that some cryptographer paid for the meal, and adopt the point of view of $C_1$ who
did not pay. The anonymity of the payer is preserved if $C_1$ cannot know if $C_2$ or $C_3$ paid
for the meal. In our setting, the predicate $\varphi_2$ is, without loss of symmetry, "$C_2$ paid". Note
that predicate $\varphi_2$ is well suited for analysis of symmetrical opacity, since detecting that $\varphi_2$
is false gives information on who paid (here $C_3$). The observation function lets $C_1$ know the
results of its coin flips ($f_{1,2}$ and $f_{1,3}$), and the results announced by the other cryptographers
($r_2$ and $r_3$). We also assume that the coin used by $C_2$ and $C_3$ has a probability of $q$ to yield
heads, and that the master flips a fair coin to decide if $C_2$ or $C_3$ pays. It can be assumed
that the coins $C_1$ flips with its neighbors are fair, since it does not affect anonymity from
$C_1$'s point of view. In order to limit the (irrelevant) interleaving, we have made the choice
to fix the ordering between the coin flips.

The corresponding FPA $\mathcal{D}$ is depicted on Fig. 8 where all $\surd$ transitions with probability
1 have been omitted from final (rectangular) states. On $\mathcal{D}$, the runs satisfying predicate $\varphi_2$
are the ones where action $p_2$ appears. The observation function $\mathcal{O}_1$ takes a run and returns
the sequence of actions over the alphabet $\{h_{1,2}, t_{1,2}, h_{1,3}, t_{1,3}\}$ and the final state reached,
containing the value announced by $C_2$ and $C_3$.

Fig. 8. The FPA corresponding to the Dining Cryptographers protocol.



There are 16 possible complete runs in this system, that yield 8 equiprobable observables:

$$\begin{array}{rl}
Obs = \{ & (h_{1,2}h_{1,3}(r_2 = 1, r_3 = 0)),\ (h_{1,2}h_{1,3}(r_2 = 0, r_3 = 1)), \\
 & (h_{1,2}t_{1,3}(r_2 = 0, r_3 = 0)),\ (h_{1,2}t_{1,3}(r_2 = 1, r_3 = 1)), \\
 & (t_{1,2}h_{1,3}(r_2 = 0, r_3 = 0)),\ (t_{1,2}h_{1,3}(r_2 = 1, r_3 = 1)), \\
 & (t_{1,2}t_{1,3}(r_2 = 1, r_3 = 0)),\ (t_{1,2}t_{1,3}(r_2 = 0, r_3 = 1)) \ \}
\end{array}$$

Moreover, each observation results from a run in which $C_2$ pays and a run in which $C_3$ pays,
this difference being masked by the secret coin flip between them. For example, runs
$\rho_h = h_{1,2}h_{1,3}h_{2,3}p_2(r_2 = 1, r_3 = 0)$ and $\rho_t = h_{1,2}h_{1,3}t_{2,3}p_3(r_2 = 1, r_3 = 0)$ yield the same
observable $o_0 = h_{1,2}h_{1,3}(r_2 = 1, r_3 = 0)$, but the predicate is true in the first case and false in
the second one. Therefore, if $0 < q < 1$, the unprobabilistic version of $\mathcal{D}$ is opaque. However,
if $q \neq \frac{1}{2}$, for each observable, one of them is more likely to be lying, therefore paying. In the
aforementioned example, when observing $o_0$, $\rho_h$ has occurred with probability $q$, whereas $\rho_t$
has occurred with probability $1 - q$. RPSO can measure this advantage globally.

For each observation class, the vulnerability of $\varphi_2$ is $\max(q, 1 - q)$. Hence the RPSO will
be

$$\mathsf{PO}_r^S(\mathcal{D}, \varphi_2, \mathcal{O}_1) = \frac{-1}{\log(\min(q, 1 - q))}$$

The variations of the RPSO when changing the bias on $q$ are depicted in Fig. 9. Analysis of
RPSO according to the variation of $q$ yields that the system is perfectly secure if there is
no bias on the coin, and insecure if $q = 0$ or $q = 1$.
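
These figures can be cross-checked by enumerating the 16 runs directly. The sketch below assumes the simplified model described above (fair master choice between $C_2$ and $C_3$, fair coins for $C_1$, heads with probability $q$ on the coin shared by $C_2$ and $C_3$) and base-2 logarithms; the function name `dining_rpso` is hypothetical.

```python
from fractions import Fraction
from itertools import product
import math

def dining_rpso(q):
    """Return (number of observables, RPSO) for a C2/C3 coin giving heads with probability q."""
    q = Fraction(q)
    joint = {}  # observable -> [P(C3 pays, o), P(C2 pays, o)]
    for payer, f12, f13, f23 in product((2, 3), (0, 1), (0, 1), (0, 1)):
        # fair master choice, fair coins for C1, biased coin between C2 and C3
        prob = Fraction(1, 2) * Fraction(1, 4) * (q if f23 == 0 else 1 - q)
        r2 = f12 ^ f23 ^ int(payer == 2)
        r3 = f13 ^ f23 ^ int(payer == 3)
        observable = (f12, f13, r2, r3)
        joint.setdefault(observable, [Fraction(0), Fraction(0)])[int(payer == 2)] += prob
    inverse = 0.0
    for p_c3, p_c2 in joint.values():
        p_obs = p_c3 + p_c2
        if p_obs == 0:
            continue
        vulnerability = max(p_c3, p_c2) / p_obs
        if vulnerability == 1:
            return len(joint), 0.0  # the payer is revealed in this class
        inverse += float(p_obs) * -math.log2(float(1 - vulnerability))
    return len(joint), 1 / inverse

print(dining_rpso(Fraction(1, 2)))  # (8, 1.0)
print(dining_rpso(Fraction(1, 4)))  # (8, 0.5)
```

With a fair coin the measure is 1, and a bias of $q = \frac{1}{4}$ already halves it, in agreement with $-1/\log_2(\min(q, 1-q))$.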



5 Computing opacity measures 

We now show how all measures defined above can be computed for regular predicates and
simple observation functions. The method relies on a synchronized product between an SA
$\mathcal{A}$ and a deterministic FA $\mathcal{K}$, similarly to [28]. This product (which can be considered pruned
of its unreachable states and states not reaching a final state) constrains the unprobabilistic
version of $\mathcal{A}$ by synchronizing it with $\mathcal{K}$. The probability of $\mathcal{L}(\mathcal{K})$ is then obtained by solving
a system of equations associated with this product. The computation of all measures results
in applications of this operation with several automata.
Fig. 9. Evolution of the restrictive probabilistic symmetric opacity of the Dining
Cryptographers protocol when changing the bias on the coin.



5.1 Computing the probability of a substochastic automaton 

Given an SA $\mathcal{A}$, a system of equations can be derived on the probabilities for each state to
yield an accepting run. This allows computing the probability of all complete runs of $\mathcal{A}$ by
a technique similar to those used in [28,29,30] for probabilistic verification.

Definition 9 (Linear system of a substochastic automaton). Let $\mathcal{A} = (\Sigma, Q, \Delta, q_0)$
be a substochastic automaton where any state can reach a final state. The linear system
associated with $\mathcal{A}$ is the following system $S_\mathcal{A}$ of linear equations over $\mathbb{R}$:

$$S_\mathcal{A} = \Big( X_q = \sum_{q' \in Q} \alpha_{q,q'} X_{q'} + \beta_q \Big)_{q \in Q}$$

where $\alpha_{q,q'} = \sum_{a \in \Sigma} \Delta(q)(a,q')$ and $\beta_q = \Delta(q)(\surd)$.

When non-determinism is involved, for instance in Markov Decision Processes [28,30], 
two systems of inequations are needed to compute maximal and minimal probabilities. Here, 
without non-determinism, both values are the same, hence Lemma 1 is a particular case of 
the results in [28,30], where uniqueness is ensured by the hypothesis (any state can reach a 
final state). The probability can thus be computed in polynomial time by solving the linear 
system associated with the SA. 

Lemma 1. Let $\mathcal{A} = (\Sigma, Q, \Delta, q_0)$ be a substochastic automaton and define for all $q \in Q$,
$L_q^\mathcal{A} = P(CRun_q(\mathcal{A}))$. Then $(L_q^\mathcal{A})_{q \in Q}$ is the unique solution of the system $S_\mathcal{A}$.
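
As an illustration, the system can be solved exactly with rational arithmetic. The sketch below assumes the system has the form $X_q = \sum_{q'} \alpha_{q,q'} X_{q'} + \beta_q$ and uses plain Gauss-Jordan elimination; the encoding of the automaton by the dictionaries `alpha` and `beta` and the two-state example are hypothetical.

```python
from fractions import Fraction

def solve_sa(alpha, beta):
    """Solve X_q = sum_q' alpha[q][q'] * X_q' + beta[q] exactly.

    Rewrites the system as (I - alpha) X = beta and applies Gauss-Jordan
    elimination; uniqueness is assumed (any state can reach a final state).
    """
    states = list(beta)
    n = len(states)
    rows = [[Fraction(int(p == r)) - Fraction(alpha.get(p, {}).get(r, 0)) for r in states]
            + [Fraction(beta[p])] for p in states]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return {p: rows[i][n] for i, p in enumerate(states)}

# State s loops with probability 1/2 and moves to t with probability 1/4
# (the remaining 1/4 is lost, the automaton being substochastic); t terminates.
alpha = {"s": {"s": Fraction(1, 2), "t": Fraction(1, 4)}}
beta = {"s": 0, "t": 1}
print(solve_sa(alpha, beta)["s"])  # 1/2
```

Gaussian elimination runs in cubic time in the number of states, which is in line with the polynomial bound stated above.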

5.2 Computing the probability of a regular language 

In order to compute the probability of a language inside a system, we build a substochas- 
tic automaton that corresponds to the intersection of the system and the language, then 
compute the probability as above. 

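B. Berard, J. Mullins and M. Sassolas

Definition 10 (Synchronized product). Let $\mathcal{A} = (\Sigma, Q, \Delta, q_0)$ be a substochastic
automaton and let $\mathcal{K} = (Q \times \Sigma \times Q, Q_K, \Delta_K, q_K, F)$ be a deterministic finite automaton. The
synchronized product $\mathcal{A} \| \mathcal{K}$ is the substochastic automaton $(\Sigma, Q \times Q_K, \Delta', (q_0, q_K))$ where
transitions in $\Delta'$ are defined by: if $q_1 \to \mu \in \Delta$, then $(q_1, r_1) \to \nu \in \Delta'$ where for all $a \in \Sigma$
and $(q_2, r_2) \in Q \times Q_K$,

$$\nu(a,(q_2,r_2)) = \begin{cases} \mu(a,q_2) & \text{if } r_1 \xrightarrow{q_1,a,q_2} r_2 \in \Delta_K \\ 0 & \text{otherwise} \end{cases}
\qquad \text{and} \qquad
\nu(\surd) = \begin{cases} \mu(\surd) & \text{if } r_1 \in F \\ 0 & \text{otherwise} \end{cases}$$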



In this synchronized product, the behaviors are constrained by the finite automaton. Actions 
not allowed by the automaton are trimmed, and states can accept only if they correspond 
to a valid behavior of the DFA. Note that this product is defined on SA in order to allow 
several intersections. The correspondence between the probability of a language in a system 
and the probability of the synchronized product is laid out in the following lemma. 

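Lemma 2. Let $\mathcal{A} = (\Sigma, Q, \Delta, q_0)$ be an SA and $K$ a regular language over $Q \times \Sigma \times Q$
accepted by a deterministic finite automaton $\mathcal{K} = (Q \times \Sigma \times Q, Q_K, \Delta_K, q_K, F)$. Then

$$P_\mathcal{A}(K) = L^{\mathcal{A}\|\mathcal{K}}_{(q_0,q_K)}$$

Proof. Let $\rho \in CRun(\mathcal{A})$ with $tr(\rho) \in K$ and $\rho = q_0 \xrightarrow{a_1} q_1 \cdots q_n \surd$. Since $tr(\rho) \in K$
and $\mathcal{K}$ is deterministic, there is a unique run $\rho_K = q_K \xrightarrow{q_0,a_1,q_1} r_1 \cdots \xrightarrow{q_{n-1},a_n,q_n} r_n$ in $\mathcal{K}$ with $r_n \in F$.
Then the sequence $\rho' = (q_0,q_K) \xrightarrow{a_1} (q_1,r_1) \cdots \xrightarrow{a_n} (q_n,r_n)$ is a run of $\mathcal{A}\|\mathcal{K}$. There is a
one-to-one match between runs of $\mathcal{A}\|\mathcal{K}$ and pairs of runs in $\mathcal{A}$ and $\mathcal{K}$ with the same trace.
Moreover,

$$P_{\mathcal{A}\|\mathcal{K}}(\rho') = \Delta'(q_0,q_K)(a_1,(q_1,r_1)) \times \cdots \times \Delta'(q_n,r_n)(\surd) = \Delta(q_0)(a_1,q_1) \times \cdots \times \Delta(q_n)(\surd)$$

Hence

$$P_\mathcal{A}(K) = \sum_{\rho \mid tr(\rho) \in K} P_\mathcal{A}(\rho) = \sum_{\rho' \in CRun(\mathcal{A}\|\mathcal{K})} P_{\mathcal{A}\|\mathcal{K}}(\rho') = P_{\mathcal{A}\|\mathcal{K}}(CRun(\mathcal{A}\|\mathcal{K}))$$

and therefore from Lemma 1, $P_\mathcal{A}(K) = L^{\mathcal{A}\|\mathcal{K}}_{(q_0,q_K)}$.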
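
The construction can be sketched in Python on a one-state automaton. The dictionary encodings, the marker `DONE` for the termination action and the function names are assumptions of this sketch; in addition, the probability is approximated by fixed-point iteration of the linear system rather than by an exact resolution, which is a simplification and not the method of Lemma 1.

```python
DONE = "done"  # stands for the termination action of a complete run

def synchronized_product(delta, q0, dfa, r0, finals):
    """Reachable part of A || K.

    delta[q] maps (a, q2) to a probability and DONE to the termination
    probability; dfa maps (r1, (q1, a, q2)) to the unique successor r2.
    """
    initial = (q0, r0)
    product, todo = {}, [initial]
    while todo:
        state = todo.pop()
        if state in product:
            continue
        q1, r1 = state
        out = {}
        for key, prob in delta[q1].items():
            if key == DONE:
                if r1 in finals:      # accept only on a valid DFA behavior
                    out[DONE] = prob
                continue
            a, q2 = key
            r2 = dfa.get((r1, (q1, a, q2)))
            if r2 is not None:        # actions not allowed by K are trimmed
                out[(a, (q2, r2))] = prob
                todo.append((q2, r2))
        product[state] = out
    return product, initial

def language_probability(product, initial, rounds=500):
    """Approximate L_initial by iterating X := alpha X + beta from X = 0."""
    value = {s: 0.0 for s in product}
    for _ in range(rounds):
        value = {s: sum(p if k == DONE else p * value[k[1]] for k, p in out.items())
                 for s, out in product.items()}
    return value[initial]

# One-state SA: a with probability 1/2, b with 1/4, termination with 1/4.
delta = {"s": {("a", "s"): 0.5, ("b", "s"): 0.25, DONE: 0.25}}
# DFA for "at least one b": r0 waits for b, r1 is accepting and absorbing.
dfa = {("r0", ("s", "a", "s")): "r0", ("r0", ("s", "b", "s")): "r1",
       ("r1", ("s", "a", "s")): "r1", ("r1", ("s", "b", "s")): "r1"}
prod, init = synchronized_product(delta, "s", dfa, "r0", {"r1"})
print(round(language_probability(prod, init), 6))  # 0.5
```

In this toy case the product has two states, and the value for the initial pair solves $X_0 = \frac{1}{2} X_0 + \frac{1}{4} X_1$ with $X_1 = 1$, hence $\frac{1}{2}$.
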
5.3 Computing all opacity measures 

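All measures defined previously can be computed as long as, for $i \in \{0, 1\}$ and $o \in Obs$, all
probabilities

$$P(\mathbf{1}_\varphi = i) \qquad P(\mathcal{O} = o) \qquad P(\mathbf{1}_\varphi = i, \mathcal{O} = o)$$

can be computed. Indeed, even deciding whether $\mathcal{O}^{-1}(o) \subseteq \varphi$ can be done by testing
$P(\mathcal{O} = o) > 0 \wedge P(\mathbf{1}_\varphi = 0, \mathcal{O} = o) = 0$.

Now suppose $Obs$ is a finite set, $\varphi$ and all $\mathcal{O}^{-1}(o)$ are regular sets. Then one can build
deterministic finite automata $\mathcal{A}_\varphi$, $\mathcal{A}_{\overline{\varphi}}$, $\mathcal{A}_o$ for $o \in Obs$ that accept respectively $\varphi$, $\overline{\varphi}$, and
$\mathcal{O}^{-1}(o)$.

Synchronizing automaton $\mathcal{A}_\varphi$ with $\mathcal{A}$ and pruning it yields a substochastic automaton
$\mathcal{A}\|\mathcal{A}_\varphi$. By Lemma 2, the probability $P(\mathbf{1}_\varphi = 1)$ is then computed by solving the linear
system associated with $\mathcal{A}\|\mathcal{A}_\varphi$. Similarly, one obtains $P(\mathbf{1}_\varphi = 0)$ (with $\mathcal{A}_{\overline{\varphi}}$), $P(\mathcal{O} = o)$
(with $\mathcal{A}_o$), $P(\mathbf{1}_\varphi = 1, \mathcal{O} = o)$ (synchronizing $\mathcal{A}\|\mathcal{A}_\varphi$ with $\mathcal{A}_o$), and $P(\mathbf{1}_\varphi = 0, \mathcal{O} = o)$
(synchronizing $\mathcal{A}\|\mathcal{A}_{\overline{\varphi}}$ with $\mathcal{A}_o$).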

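Theorem 1. Let $\mathcal{A}$ be an FPA. If $Obs$ is a finite set, $\varphi$ is a regular set and for $o \in Obs$,
$\mathcal{O}^{-1}(o)$ is a regular set, then for $\mathsf{PO} \in \{\mathsf{PO}_\ell^A, \mathsf{PO}_\ell^S, \mathsf{PO}_r^A, \mathsf{PO}_r^S\}$, $\mathsf{PO}(\mathcal{A},\varphi,\mathcal{O})$ can be
computed.

The computation of opacity measures is done in polynomial time in the size of $Obs$ and
DFAs $\mathcal{A}_\varphi$, $\mathcal{A}_{\overline{\varphi}}$, $\mathcal{A}_o$.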

A prototype tool implementing this algorithm was developed in Java [31], yielding nu- 
merical values for measures of opacity. 
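
The possibilistic tests mentioned in this section only need the joint probabilities. A minimal Python sketch, independent of the Java prototype and with hypothetical names, assumes the joint distribution is given as a dictionary mapping each observable to the pair $(P(\mathbf{1}_\varphi = 0, \mathcal{O} = o), P(\mathbf{1}_\varphi = 1, \mathcal{O} = o))$.

```python
from fractions import Fraction

def leaking_classes(joint):
    """Return the observables whose class is inside phi, and inside its complement."""
    in_phi, in_complement = [], []
    for o, (p_false, p_true) in joint.items():
        if p_false + p_true == 0:
            continue                  # o is never observed
        if p_false == 0:
            in_phi.append(o)          # P(O = o) > 0 and P(1_phi = 0, O = o) = 0
        if p_true == 0:
            in_complement.append(o)   # symmetric test for the complement of phi
    return in_phi, in_complement

def is_opaque(joint):
    return not leaking_classes(joint)[0]

def is_symmetrically_opaque(joint):
    in_phi, in_complement = leaking_classes(joint)
    return not in_phi and not in_complement

joint = {"o1": (Fraction(1, 4), Fraction(1, 4)), "o2": (Fraction(1, 2), Fraction(0))}
print(is_opaque(joint), is_symmetrically_opaque(joint))  # True False
```

The example is opaque but not symmetrically opaque, since the class of `o2` contains no run of $\varphi$ with positive probability.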



6 Comparison of the measures of opacity 

In this section we compare the discriminating power of the measures discussed above. As
described above, the liberal measures evaluate the leak, hence 0 represents the best possible
value from a security point of view, producing an opaque system. For such an opaque
system, the restrictive measures evaluate the robustness of this opacity. As a result, 1 is the
best possible value.

6.1 Abstract examples 

The values of these metrics are first compared for extremal cases of Fig. 10. These values 
are displayed in Table 3. 



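Fig. 10. Example of repartition of probabilities of $\mathbf{1}_\varphi$ and $\mathcal{O}$ in 7 cases: (a) $\mathcal{A}_1$, (b) $\mathcal{A}_2$,
(c) $\mathcal{A}_3$, (d) $\mathcal{A}_4$, (e) $\mathcal{A}_5$, (f) $\mathcal{A}_6$, (g) $\mathcal{A}_7$.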



First, the system $\mathcal{A}_1$ of Fig. 10(a) is intuitively very secure since, with or without
observation, an attacker has no information whether $\varphi$ was true or not. This security is reflected in
all symmetrical measures, with the highest possible scores in all cases. It is nonetheless deemed
more insecure for RPO, since opacity is perfect when $\varphi$ is always false.
System   (a) A1   (b) A2   (c) A3   (d) A4   (e) A5   (f) A6   (g) A7
LPO      0        0        0        1/4      1/4      1/4      0
LPSO     0        0        0        1/4      1/2      1/2      1/4
RPO      1/2      3/4      3/8      0        0        0        12/25
RPSO     1        1/2      1/2      0        0        0        0

Table 3. Values of the different opacity measures for systems of Fig. 10(a)-(g).



The case of $\mathcal{A}_2$ of Fig. 10(b) only differs from $\mathcal{A}_1$ by the global repartition of $\varphi$ in
$Run(\mathcal{A})$. The information an attacker gets comes not from the observation, but from $\varphi$
itself. Therefore RPSO, which does not remove the information available before observation,
evaluates this system as less secure than $\mathcal{A}_1$. Measures based on information theory [12,1]
would consider this system as secure. However, such measures lack strong ties with opacity,
which depends only on the information available to the observer, wherever this information
comes from. In addition, RPO finds $\mathcal{A}_2$ more secure than $\mathcal{A}_1$: $\varphi$ is verified less often. Note
that the complement would not change the value for symmetrical measures, while being
insecure for RPO (with $\mathsf{PO}_r^A = \frac{1}{4}$).

However, since each observation class is considered individually, RPSO does not discriminate
$\mathcal{A}_2$ and $\mathcal{A}_3$ of Fig. 10(c). Here, the information is the same in each observation class as
for $\mathcal{A}_2$, but the repartition of $\varphi$ gives no advantage at all to an attacker without observation.

When the system is not opaque (resp. symmetrically opaque), RPO (resp. RPSO) cannot
discriminate them, and LPO (resp. LPSO) becomes relevant. For example, $\mathcal{A}_4$ is not opaque
for the classical definitions, therefore $\mathsf{PO}_r^A = \mathsf{PO}_r^S = 0$ and both $\mathsf{PO}_\ell^A > 0$ and $\mathsf{PO}_\ell^S > 0$.

System $\mathcal{A}_5$ of Fig. 10(e) has a greater $\mathsf{PO}_\ell^S$ than $\mathcal{A}_4$. However, LPO is unchanged since
the class completely out of $\varphi$ is not taken into account. Remark that system $\mathcal{A}_7$ is opaque
but not symmetrically opaque, hence the relevant measures are $\mathsf{PO}_\ell^S$ and $\mathsf{PO}_r^A$. Also note
that once a system is not opaque, the repartition of classes that do not leak information is
not taken into account, hence equal values in the cases of $\mathcal{A}_5$ and $\mathcal{A}_6$.



6.2 A more concrete example 

Consider the following programs $P_1$ and $P_2$, inspired from [12], where $k$ is a given parameter,
random selects uniformly an integer value (in binary) between its two arguments and & is
the bitwise and:

P1: H := random(0, 2^{8k} - 1);
    if H mod 8 = 0 then
      L := H
    else
      L := -1
    fi

P2: H := random(0, 2^{8k} - 1);
    L := H & 0^{7k}1^k

In both cases, the value of H, an integer over $8k$ bits, is supposed to remain secret,
and cannot be observed directly, while the value of L is public. Thus the observation is the
"L := . . ." action. Intuitively, $P_1$ divulges the exact value of H with probability $\frac{1}{8}$. On the
other hand, $P_2$ leaks the value of one eighth of its bits (the least significant ones) at every
execution.
These programs can be translated into FPAs $\mathcal{A}_{P_1}$ and $\mathcal{A}_{P_2}$ of Fig. 11. In order to
have a boolean predicate, the secret is not the value of variable H, but whether H = L:
$\varphi_= = \{(H = x)(L = x) \mid x \in \{0, \ldots, 2^{8k} - 1\}\}$. Non opacity then means that the attacker
discovers the secret value. Weaker predicates can also be considered, like equality of H with
a particular value or H belonging to a specified subset of values, but we chose the simplest
one. First remark that $\varphi_=$ is not opaque on $P_1$ in the classical sense (both symmetrically
or not). Hence both RPO and RPSO are null. On the other hand, $\varphi_=$ is opaque on $P_2$,
hence LPO and LPSO are null. The values for all measures are gathered in Table 4.

Program   PO_l^A   PO_l^S   PO_r^A           PO_r^S
P1        1/8      1        0                0
P2        0        0        1 - 1/2^{7k}     1/(7k)

Table 4. Opacity measures for programs P1 and P2.

Note that only restrictive opacity for $P_2$ depends on $k$. This comes from the fact that in all other
cases, both $\varphi_=$ and the equivalence classes scale at the same rate with $k$. In the case of $P_2$,
adding length to the secret variable H dilutes $\varphi_=$ inside each class. Hence the greater $k$ is,
the harder it is for an attacker to know that $\varphi_=$ is true, thus to crack asymmetrical opacity.
Indeed, it will tend to get false in most cases, thus providing an easy guess, and a low value
for symmetrical opacity.
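
For small values of $k$, the restrictive measures of $P_2$ can be checked by brute force. The Python sketch below enumerates all values of H, groups them by the observed value of L, and assumes RPO is the weighted harmonic mean of $P(\overline{\varphi_=} \mid o)$ and RPSO the inverse of the weighted average of $-\log_2(1 - V)$; the function name is hypothetical.

```python
from fractions import Fraction
import math

def p2_restrictive_measures(k):
    """Brute-force (RPO, RPSO) of 'H = L' for P2, enumerating all 2^(8k) values of H."""
    bits = 8 * k
    mask = (1 << k) - 1          # the constant 0^{7k} 1^k
    counts = {}                  # value of L -> [#runs with H != L, #runs with H == L]
    for h in range(1 << bits):
        low = h & mask
        counts.setdefault(low, [0, 0])[int(h == low)] += 1
    total = 1 << bits
    inverse_rpo, inverse_rpso = Fraction(0), 0.0
    for n_false, n_true in counts.values():
        # for P2 every class contains runs on both sides, so no division by zero occurs
        p_obs = Fraction(n_false + n_true, total)
        p_false = Fraction(n_false, n_false + n_true)
        vulnerability = max(p_false, 1 - p_false)
        inverse_rpo += p_obs / p_false
        inverse_rpso += float(p_obs) * -math.log2(float(1 - vulnerability))
    return 1 / inverse_rpo, 1 / inverse_rpso

rpo_1, rpso_1 = p2_restrictive_measures(1)
print(rpo_1)  # 127/128
```

For $k = 1$ each of the two classes contains 128 values of H, exactly one of which satisfies H = L, so the asymmetrical measure is close to 1 while the symmetrical one is $\frac{1}{7}$; both move further apart as $k$ grows.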
6.3 Crowds protocol

The anonymity protocol known as crowds was introduced in [8] and recently studied in the
probabilistic framework in [13,14]. When a user wants to send a message (or request) to a
server without the latter knowing the origin of the message, the user routes the message
through a crowd of $n$ users. To do so, it selects a user randomly in the crowd (including
himself), and sends him the message. When a user receives a message to be routed according
to this protocol, it either sends the message to the server with probability $1 - q$ or forwards
it to a user in the crowd, with probability $q$. The choice of a user in the crowd is always
equiprobable. Under these assumptions, this protocol is known to be secure, since no user
is more likely than another to be the actual initiator; indeed its RPO is very high. However,
there can be $c$ corrupt users in the crowd which divulge the identity of the person that sent
the message to them. In that case, if a user sends directly a message to a corrupt user,
its identity is no longer protected. The goal of corrupt users is therefore not to transmit
messages, hence they cannot initiate the protocol. The server and the corrupt users cooperate
to discover the identity of the initiator. RPO can measure the security of this system,
depending on $n$ and $c$.

First, consider our protocol as the system in Fig. 12. In this automaton, states $1, \ldots, n - c$
corresponding to honest users are duplicated in order to differentiate their behavior as
initiator or as the receiver of a message from the crowd. The predicate we want to be opaque
is $\varphi_i$ that contains all the runs in which $i$ is the initiator of the request. The observation
function $\mathcal{O}$ returns the penultimate state of the run, i.e. the honest user that will be seen
by the server or a corrupt user.




Fig. 12. FPA $\mathcal{C}_n^c$ for Crowds protocol with $n$ users, among whom $c$ are corrupted.



Quantifying Opacity 25 



For the sake of brevity, we write $i \rightsquigarrow$ to denote the event "a request was initiated by $i$"
and $\rightsquigarrow j$ when "$j$ was detected by the adversary", which means that $j$ sent the message
either to a corrupt user or to the server, who both try to discover who the initiator was.
The abbreviation $i \rightsquigarrow j$ stands for $(i \rightsquigarrow) \wedge (\rightsquigarrow j)$. Notation $\neg i \rightsquigarrow$ means that "a request was
initiated by someone else than $i$"; similarly, combinations of these notations are used in the
sequel. We also use the Kronecker symbol $\delta_{ij}$ defined by $\delta_{ij} = 1$ if $i = j$ and 0 otherwise.

Computation of the probabilities. All probabilities $P(i \rightsquigarrow j)$ can be automatically
computed using the method described in Section 5. For example, $P(1 \rightsquigarrow (n - c))$, the
probability for the first user to initiate the protocol while the last honest user is detected,
can be computed from substochastic automaton $\mathcal{C}_n^c \| \mathcal{A}_{1 \rightsquigarrow (n-c)}$ depicted on Fig. 13. In this
automaton, the only duplicated state remaining is $1'$.

This SA can also be represented by a transition matrix (like a Markov chain), which is
given in Table 5. An additional column indicates the probability for the $\surd$ action, which
ends the run (here it is either 1 if the state is final and 0 if not).

The associated system is represented in Table 6 where $L_S$ corresponds to the "Server"
state. Each line of the system is given by the outgoing probabilities of the corresponding
state in the SA, or alternatively by the corresponding line of the matrix. Resolving it (see
Appendix B) yields $L_i = \frac{q}{n}$ for all $i \in \{1, \ldots, n - c - 1\}$, $L_{n-c} = 1 - \frac{q \cdot (n - c - 1)}{n}$, $L_{1'} = \frac{1}{n}$,
and $L_0 = \frac{1}{(n-c) \cdot n}$. Therefore, $P(1 \rightsquigarrow (n - c)) = \frac{1}{(n-c) \cdot n}$.

Fig. 13. SA $\mathcal{C}_n^c \| \mathcal{A}_{1 \rightsquigarrow (n-c)}$ corresponding to runs where user 1 initiates the protocol and user
$(n - c)$ is detected.

In this case, simple reasoning on the symmetries of the model allows deriving other
probabilities $P(i \rightsquigarrow j)$. Remark that the probability for a message to go directly from
initiator to the adversary (who cannot be the server) is $\frac{c}{n}$: it only happens if a corrupt user is
chosen by the initiator. If a honest user is chosen by the initiator, then the length of the path
will be greater, with probability $\frac{n-c}{n}$. By symmetry all honest users have equal probability
to be the initiator, and equal probability to be detected. Hence $P(i \rightsquigarrow) = P(\rightsquigarrow j) = \frac{1}{n-c}$.








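          0    1'        1     ...   n-c-1   n-c    n-c+1   ...   n     Server   √
0         ·    1/(n-c)   ·     ...   ·       ·      ·       ...   ·     ·        ·
1'        ·    ·         1/n   ...   1/n     1/n    ·       ...   ·     ·        ·
1         ·    ·         q/n   ...   q/n     q/n    ·       ...   ·     ·        ·
...
n-c-1     ·    ·         q/n   ...   q/n     q/n    ·       ...   ·     ·        ·
n-c       ·    ·         q/n   ...   q/n     q/n    q/n     ...   q/n   1-q      ·
n-c+1     ·    ·         ·     ...   ·       ·      ·       ...   ·     ·        1
...
n         ·    ·         ·     ...   ·       ·      ·       ...   ·     ·        1
Server    ·    ·         ·     ...   ·       ·      ·       ...   ·     ·        1

Table 5. The matrix giving the transition probabilities between the states of $\mathcal{C}_n^c \| \mathcal{A}_{1 \rightsquigarrow (n-c)}$.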



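$$\left\{\begin{array}{ll}
L_0 = \frac{1}{n-c} \cdot L_{1'} & \\
L_{1'} = \sum_{i=1}^{n-c} \frac{1}{n} \cdot L_i & \\
L_i = \sum_{j=1}^{n-c} \frac{q}{n} \cdot L_j & \text{for } i \in \{1, \ldots, n-c-1\} \\
L_{n-c} = (1-q) \cdot L_S + \sum_{i=1}^{n} \frac{q}{n} \cdot L_i & \\
L_{n-c+1} = 1 & \\
\quad \vdots & \\
L_n = 1 & \\
L_S = 1 &
\end{array}\right.$$

Table 6. Linear system associated to SA $\mathcal{C}_n^c \| \mathcal{A}_{1 \rightsquigarrow (n-c)}$ of Fig. 13.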
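
The resolution can be replayed with exact rational arithmetic for concrete parameters. The Python sketch below relies on one reading of the system, stated as an assumption in the comments of `crowds_system`: state 0 moves to $1'$ with probability $\frac{1}{n-c}$, state $1'$ and the honest users $1, \ldots, n-c-1$ only keep their transitions towards honest users, user $n-c$ keeps all its transitions, and corrupt users and the server terminate. The function names and the sample parameters are hypothetical.

```python
from fractions import Fraction

def solve(states, alpha, beta):
    """Solve X = alpha X + beta exactly by Gauss-Jordan elimination on (I - alpha) X = beta."""
    n = len(states)
    rows = [[Fraction(int(p == r)) - alpha[p].get(r, Fraction(0)) for r in states] + [beta[p]]
            for p in states]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return {p: rows[i][n] for i, p in enumerate(states)}

def crowds_system(n, c, q):
    """Assumed system for runs where user 1 initiates and user n - c is detected."""
    m = n - c
    states = ["0", "1'"] + list(range(1, n + 1)) + ["S"]
    alpha = {s: {} for s in states}
    beta = {s: Fraction(0) for s in states}
    alpha["0"]["1'"] = Fraction(1, m)
    for i in range(1, m + 1):
        alpha["1'"][i] = Fraction(1, n)        # the initiator picks an honest user
    for i in range(1, m):
        for j in range(1, m + 1):
            alpha[i][j] = q / n                # honest users other than n - c only forward
    for j in range(1, n + 1):
        alpha[m][j] = q / n                    # user n - c may forward to anyone
    alpha[m]["S"] = 1 - q                      # or deliver to the server
    for s in list(range(m + 1, n + 1)) + ["S"]:
        beta[s] = Fraction(1)                  # corrupt users and the server terminate
    return states, alpha, beta

n, c, q = 5, 2, Fraction(3, 4)
solution = solve(*crowds_system(n, c, q))
print(solution["0"])  # 1/15
```

For $n = 5$, $c = 2$ the value obtained for state 0 is $\frac{1}{(n-c) \cdot n} = \frac{1}{15}$, whatever the value of $q < 1$ used in the sketch.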



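Event $i \rightsquigarrow j$ occurs when $i$ is chosen as the initiator (probability $\frac{1}{n-c}$), and either (1)
$i = j$ and $i$ chooses a corrupted user to route its message, or (2) a honest user is chosen
and $j$ sends the message to a corrupted user or the server (the internal route between honest
users before $j$ is irrelevant). Therefore

$$P(i \rightsquigarrow j) = \frac{1}{n-c} \cdot \left( \delta_{ij} \cdot \frac{c}{n} + \frac{1}{n-c} \cdot \frac{n-c}{n} \right) = \frac{1}{n-c} \cdot \left( \delta_{ij} \cdot \frac{c}{n} + \frac{1}{n} \right)$$

The case when $i$ is not the initiator is derived from this probability:

$$P(\neg i \rightsquigarrow j) = \sum_{k=1, k \neq i}^{n-c} P(k \rightsquigarrow j) = \frac{1}{n-c} \cdot \left( (1 - \delta_{ij}) \cdot \frac{c}{n} + \frac{n-c-1}{n} \right)$$

Conditional probabilities thus follow:

$$P(i \rightsquigarrow \mid \rightsquigarrow j) = \frac{P(i \rightsquigarrow j)}{P(\rightsquigarrow j)} = \delta_{ij} \cdot \frac{c}{n} + \frac{1}{n}$$

$$P(\neg i \rightsquigarrow \mid \rightsquigarrow j) = \frac{P(\neg i \rightsquigarrow j)}{P(\rightsquigarrow j)} = (1 - \delta_{ij}) \cdot \frac{c}{n} + \frac{n-c-1}{n}$$

Interestingly, these probabilities do not depend on $q$⁵.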



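Computation of RPO. From the probabilities above, one can compute an analytical value for $\mathsf{PO}_r^A(C_n^c, \varphi_1, O)$.

$$
\begin{aligned}
\frac{1}{\mathsf{PO}_r^A(C_n^c, \varphi_1, O)} &= \sum_{j=1}^{n-c} P(\rightsquigarrow j) \cdot \frac{1}{P(\neg 1 \rightsquigarrow \mid \rightsquigarrow j)} \\
&= (n-c-1) \cdot \frac{1}{n-c} \cdot \frac{n}{n-1} + \frac{1}{n-c} \cdot \frac{n}{n-c-1} \\
&= \frac{n}{n-c} \cdot \left( \frac{n-c-1}{n-1} + \frac{1}{n-c-1} \right) \\
&= \frac{n \cdot (n^2 + c^2 - 2nc - n + 2c)}{(n-c) \cdot (n-1) \cdot (n-c-1)}
\end{aligned}
$$

Hence

$$\mathsf{PO}_r^A(C_n^c, \varphi_1, O) = \frac{(n-c) \cdot (n-1) \cdot (n-c-1)}{n \cdot (n^2 + c^2 - 2nc - n + 2c)}$$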

which tends to 1 as $n$ increases to $+\infty$ (for a fixed number of corrupted users). The evolution of RPO is represented in Fig. 14(a), where blue means low and red means high. If the proportion of corrupted users is fixed, say $n = 4c$, we obtain

$$\mathsf{PO}_r^A(C_{4c}^c, \varphi_1, O) = \frac{(4c-1) \cdot (9c-3)}{4c \cdot (9c-2)}$$

which also tends to 1 as the crowd size increases. When there are no corrupted users,

$$\mathsf{PO}_r^A(C_n^0, \varphi_1, O) = \frac{n-1}{n}$$

which is close to 1, but never exactly 1, since $\varphi_1$ is not always false, although it holds in a decreasing proportion of runs as the crowd grows. This result should be compared with the one from [8], which states that Crowds is secure since each user is "beyond suspicion" of being the initiator, but "absolute privacy" is not achieved.
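
As a sanity check of the closed form, the sketch below recomputes RPO for the predicate "user 1 is the initiator" directly from the harmonic-mean expression, i.e. the inverse of RPO is the sum over observations of $P(\rightsquigarrow j) / P(\neg 1 \rightsquigarrow \mid \rightsquigarrow j)$. The function names are illustrative, and the sketch assumes n - c - 1 > 0 so that no denominator vanishes.

```python
from fractions import Fraction


def rpo_crowds(n, c):
    """RPO computed from the conditional probabilities of the text."""
    honest = n - c
    p_obs = Fraction(1, honest)                # P(~> j) for every honest j
    not_init_self = Fraction(n - c - 1, n)     # P(not 1 ~> | ~> 1)
    not_init_other = Fraction(n - 1, n)        # P(not 1 ~> | ~> j), j != 1
    inverse = p_obs / not_init_self + (honest - 1) * p_obs / not_init_other
    return 1 / inverse


def rpo_closed_form(n, c):
    """Closed form stated in the text."""
    return Fraction((n - c) * (n - 1) * (n - c - 1),
                    n * (n * n + c * c - 2 * n * c - n + 2 * c))
```

With exact rational arithmetic the two functions agree, and the special cases n = 4c and c = 0 reduce to the expressions given above.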

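Computation of RPSO. From the probabilities computed above, we obtain that if $i \neq j$,

$$V(i \rightsquigarrow \mid \rightsquigarrow j) = \max\left( \frac{1}{n}, \frac{n-1}{n} \right) = \frac{1}{n} \cdot \max(1, n-1).$$

Except in the case of $n = 1$ (when the system is non-opaque, hence $\mathsf{PO}_r^S(C_1^0, \varphi_1, O) = 0$), $V(i \rightsquigarrow \mid \rightsquigarrow j) = \frac{n-1}{n}$.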



5 This stems from the fact that the original models had either the server or the corrupt users as attackers, not both at the same time.

[Figure: (a) evolution of $\mathsf{PO}_r^A(C_n^c, \varphi_1, O)$ with $n$ and $c$, red meaning a value close to 1 and blue meaning close to 0; (b) evolution of $\mathsf{PO}_r^S(C_n^c, \varphi_1, O)$ with $n$ when $c = 5$.]

Fig. 14. Evolution of restrictive opacity with the size of the crowd.



In the case when $i = j$,

$$V(i \rightsquigarrow \mid \rightsquigarrow i) = \max\left( \frac{c+1}{n}, \frac{n-c-1}{n} \right) = \frac{1}{n} \cdot \max(c+1, n-c-1).$$

That means the vulnerability for the observation class corresponding to the case when $i$ is actually detected depends on the proportion of corrupted users in the crowd. Indeed, $V(i \rightsquigarrow \mid \rightsquigarrow i) = \frac{c+1}{n}$ if and only if $n \le 2(c+1)$. The two cases shall be separated.

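When $n \le 2(c+1)$. The message is initially more likely to be sent to a corrupt user or to the initiator himself than to any other user in the crowd:

$$
\begin{aligned}
\sum_{j=1}^{n-c} P(\rightsquigarrow j) \cdot \log(1 - V(i \rightsquigarrow \mid \rightsquigarrow j)) &= \frac{n-c-1}{n-c} \cdot \log\frac{1}{n} + \frac{1}{n-c} \cdot \log\frac{n-c-1}{n} \\
&= \frac{1}{n-c} \cdot \left( \log(n-c-1) - (n-c) \cdot \log(n) \right) \\
&= \frac{\log(n-c-1)}{n-c} - \log(n)
\end{aligned}
$$

Hence

$$\mathsf{PO}_r^S(C_n^c, \varphi_i, O) = \frac{1}{\log(n) - \frac{\log(n-c-1)}{n-c}}$$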

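When $n > 2(c+1)$. The message is initially more likely to be sent to an honest user different from the initiator:

$$
\begin{aligned}
\sum_{j=1}^{n-c} P(\rightsquigarrow j) \cdot \log(1 - V(i \rightsquigarrow \mid \rightsquigarrow j)) &= \frac{n-c-1}{n-c} \cdot \log\frac{1}{n} + \frac{1}{n-c} \cdot \log\frac{c+1}{n} \\
&= \frac{1}{n-c} \cdot \left( \log(c+1) - (n-c) \cdot \log(n) \right) \\
&= \frac{\log(c+1)}{n-c} - \log(n)
\end{aligned}
$$

Hence

$$\mathsf{PO}_r^S(C_n^c, \varphi_i, O) = \frac{1}{\log(n) - \frac{\log(c+1)}{n-c}}$$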



The evolution of RPSO for c = 5 is depicted in Fig. 14(b). 

One can see that actually the RPSO decreases when n increases. That is because when 
there are more users in the crowd, user i is less likely to be the initiator. Hence the predicate 
chosen does not model anonymity as specified in [8] but a stronger property since RPSO 
is based on the definition of symmetrical opacity. Therefore it is meaningful in terms of 
security properties only when both the predicate and its negation are meaningful. 
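
The two cases of the RPSO computation can be compared numerically with a direct evaluation. The sketch below assumes base-2 logarithms (the text only writes "log") and assumes that RPSO is minus the inverse of the weighted sum of $\log(1 - V)$ over observation classes, as in the derivation above; function names are illustrative.

```python
import math


def rpso_crowds(n, c):
    """RPSO from the vulnerabilities, summing over the n - c observation classes."""
    honest = n - c
    p_obs = 1 / honest
    v_other = max(1 / n, (n - 1) / n)            # class ~> j with j != i
    v_self = max((c + 1) / n, (n - c - 1) / n)   # class ~> i
    total = (p_obs * math.log2(1 - v_self)
             + (honest - 1) * p_obs * math.log2(1 - v_other))
    return -1 / total


def rpso_closed_form(n, c):
    """Closed forms of the two cases n <= 2(c+1) and n > 2(c+1)."""
    inner = n - c - 1 if n <= 2 * (c + 1) else c + 1
    return 1 / (math.log2(n) - math.log2(inner) / (n - c))
```

Evaluating `rpso_crowds(n, 5)` for growing `n` reproduces the decreasing trend discussed above.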

7 Dealing with nondeterminism 

The measures presented above were all defined in the case of fully probabilistic finite automata. However, some systems present nondeterminism that cannot reasonably be abstracted away. For example, consider a system in which a malicious user Alice can control certain actions. The goal of Alice is to establish a covert communication channel with an external observer Bob. Hence she will try to influence the system in order to make communication easier. Therefore, the actual security of the system as observed by Bob should be measured against the best possible actions for Alice. Formally, Alice is a scheduler who, when facing several possible output distributions $\{\mu_1, \dots, \mu_n\}$, can choose any distribution $\nu$ on $\{1, \dots, n\}$ as weights for the $\mu_i$'s. The security as measured by opacity is the minimal security over all possible successive choices.

7.1 The nondeterministic framework 

Here we enlarge the setting of probabilistic automata considered before with nondeterminism. There are several outgoing distributions from a given state instead of a single one.

Definition 11 (Nondeterministic probabilistic automaton). A nondeterministic probabilistic automaton (NPA) is a tuple $(\Sigma, Q, \Delta, q_0)$ where

– $\Sigma$ is a finite set of actions;

– $Q$ is a finite set of states;

– $\Delta : Q \to \mathcal{P}(\mathcal{D}((\Sigma \times Q) \uplus \{\surd\}))$ is a nondeterministic probabilistic transition function;

– $q_0$ is the initial state;

where $\mathcal{P}(A)$ denotes the set of finite subsets of $A$.

The choice over the several possible distributions is made by the scheduler. It does not, however, select a single distribution to be used, but can give weights to the possible distributions.

Definition 12 (Scheduler). A scheduler on $\mathcal{A} = (\Sigma, Q, \Delta, q_0)$ is a function

$$\sigma : \mathrm{Run}(\mathcal{A}) \to \mathcal{D}(\mathcal{D}((\Sigma \times Q) \uplus \{\surd\}))$$

such that $\sigma(\rho)(\nu) > 0 \Rightarrow \nu \in \Delta(\mathrm{lst}(\rho))$.

The set of all schedulers for $\mathcal{A}$ is denoted $\mathrm{Sched}_{\mathcal{A}}$ (the dependence on $\mathcal{A}$ will be omitted if clear from the context).

Observe that the choice made by a scheduler can depend on the (arbitrarily long) history of the execution. A scheduler $\sigma$ is memoryless if there exists a function $\sigma' : Q \to \mathcal{D}(\mathcal{D}((\Sigma \times Q) \uplus \{\surd\}))$ such that $\sigma(\rho) = \sigma'(\mathrm{lst}(\rho))$. Hence a memoryless scheduler only takes into account the current state.

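Definition 13 (Scheduled NPA). NPA $\mathcal{A} = (\Sigma, Q, \Delta, q_0)$ scheduled by $\sigma$ is the (infinite) FPFA $\mathcal{A}_{/\sigma} = (\Sigma, \mathrm{Run}(\mathcal{A}), \Delta', \varepsilon)$ where

$$\Delta'(\rho)(a, \rho') = \sum_{\mu \in \Delta(q)} \sigma(\rho)(\mu) \cdot \mu(a, q') \qquad \text{if } \rho = q_0 \to \cdots \to q \text{ and } \rho' = q_0 \to \cdots \to q \xrightarrow{a} q'$$

and $\Delta'(\rho)(a, \rho') = 0$ otherwise.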

A scheduled NPA behaves as an FPFA, where the outgoing distribution is the set of all 
possible distributions weighted by the scheduler. 
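
To illustrate how a scheduler turns several outgoing distributions into a single one, here is a minimal sketch. The encoding (dictionaries from outcomes to probabilities, a `STOP` marker standing for the termination symbol) and the two toy distributions are assumptions made for the example only; they are not taken from the paper.

```python
from fractions import Fraction as F

STOP = "stop"  # stands for the termination symbol


def scheduled_step(distributions, weights):
    """Combine the distributions available in a state, weighted by the scheduler."""
    assert len(distributions) == len(weights) and sum(weights) == 1
    combined = {}
    for mu, weight in zip(distributions, weights):
        for outcome, prob in mu.items():
            combined[outcome] = combined.get(outcome, 0) + weight * prob
    return combined


# Two hypothetical distributions over (action, next state) pairs or termination.
mu_1 = {("a", "q1"): F(3, 4), ("b", "q1"): F(1, 8), STOP: F(1, 8)}
mu_2 = {("a", "q2"): F(1, 8), ("b", "q2"): F(3, 4), STOP: F(1, 8)}

# The scheduler gives weight 1/3 to mu_1 and 2/3 to mu_2.
example = scheduled_step([mu_1, mu_2], [F(1, 3), F(2, 3)])
```

The result is again a probability distribution, which is why the scheduled automaton behaves as a fully probabilistic one.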

All measures defined in this paper on fully probabilistic finite automata can be extended 
to non-deterministic probabilistic automata. First note that all measures can be defined 
on infinite systems, although they cannot in general be computed automatically, even with 
proper restrictions on predicate and observables. From the security point of view, opacity in 
the case of an NPA should be the measure for the FPFA obtained with the worst possible 
scheduler. Hence the leak evaluated by the liberal measures (LPO and LPSO) is the greatest 
possible, and the robustness evaluated by the restrictive measures (RPO and RPSO) is the 
weakest possible. 

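Definition 14. Let $\mathcal{A}$ be an NPA, $\varphi$ a predicate, and $O$ an observation function.

For $\mathsf{PO} \in \{\mathsf{PO}_\ell^A, \mathsf{PO}_\ell^S\}$, $\mathsf{PO}(\mathcal{A}, \varphi, O) = \max_{\sigma \in \mathrm{Sched}} \mathsf{PO}(\mathcal{A}_{/\sigma}, \varphi, O)$.

For $\mathsf{PO} \in \{\mathsf{PO}_r^A, \mathsf{PO}_r^S\}$, $\mathsf{PO}(\mathcal{A}, \varphi, O) = \min_{\sigma \in \mathrm{Sched}} \mathsf{PO}(\mathcal{A}_{/\sigma}, \varphi, O)$.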

7.2 The expressive power of schedulers 

In the context of the analysis of security systems running in a hostile environment, it is quite natural to consider the scheduler to be under the control of the adversary. However, if not constrained, this gives the adversary an unreasonably strong power, even for obviously secure systems, as it can reveal certain secrets. Several classes of schedulers have therefore been proposed in order to avoid the unrealistic power of unconstrained schedulers, and the ability of these classes to reach supremum probabilities has been studied [32]. We now investigate this problem for quantitative opacity.

First we show that memoryless schedulers are not sufficiently expressive, with the following counterexample.

Theorem 2. There exists an NPA $\mathcal{B}$ such that the value $\mathsf{PO}_r^A(\mathcal{B}, \varphi, O)$ cannot be reached by a memoryless scheduler.

Proof. Consider the NPA $\mathcal{B}$ of Fig. 15. Transitions on $a$ and $b$ going to state $q_1$ (along with the westbound $\surd$) are part of the same probabilistic transition indicated by the arc linking the outgoing edges (and similarly eastbound). Let $\varphi$ be the (regular) predicate consisting of runs whose trace projected onto $\{a, b\}$ is in $(ab)^+ + (ab)^*a$ (so $a$ and $b$ must alternate). Let $O$ be the observation function that keeps the last $o_i$ of the run. Hence there are only three observables, $\varepsilon$, $o_1$, and $o_2$. Intuitively, a scheduler can introduce a bias in the next letter read from state $q_0$.



Fig. 15. A nondeterministic probabilistic automaton $\mathcal{B}$.



First consider a memoryless scheduler $\sigma_p$. It can only choose once what weight will be assigned to each transition. This choice is parametrized by the probability $p$ that represents the weight of the $q_1$ transition. The scheduled NPA $\mathcal{B}_{/\sigma_p}$ is depicted in Fig. 16(a). The probabilities can be computed using the technique laid out in Section 5. We obtain the following probabilities (see Appendix C.1 for details):

$$P(\varepsilon) = \frac{1}{8} \qquad P(o_1) = \frac{7}{8} \cdot p \qquad P(o_2) = \frac{7}{8} \cdot (1-p) \qquad P(\varphi, \varepsilon) = 0$$

$$P(\varphi, o_1) = \frac{p}{8} \cdot \frac{5p + 49}{25p^2 - 25p + 58} \qquad P(\varphi, o_2) = \frac{1-p}{4} \cdot \frac{15p + 7}{25p^2 - 25p + 58}$$

Which yields

$$\frac{1}{\mathsf{PO}_r^A(\mathcal{B}_{/\sigma_p}, \varphi, O)} = \frac{1}{8} + \frac{49}{8} \cdot f(p) \cdot \left( \frac{p}{7f(p) - 5p - 49} + \frac{1-p}{7f(p) - 30p - 14} \right)$$

with $f(p) = 25p^2 - 25p + 58$ (see Appendix C.2). It can be shown⁶ that regardless of $p$, $\mathsf{PO}_r^A(\mathcal{B}_{/\sigma_p}, \varphi, O)$ never falls below 0.88.

Now consider a scheduler $\sigma_m$ with memory that tries to maximize the realization of $\varphi$. To achieve that, it introduces a bias towards taking the letter which will fulfill $\varphi$: first an $a$, then a $b$, etc. Hence on even positions it will choose only the transition to $q_1$ (with probability 1), while it will choose the transition to $q_2$ on odd positions. The resulting FPFA is depicted in Fig. 16(b). In this case, the probabilities of interest are:

$$P(\varepsilon) = \frac{1}{8} \qquad P(o_1) = \frac{7}{15} \qquad P(o_2) = \frac{7}{8} \cdot \frac{7}{15}$$



6 With the help of tools such as WolframAlpha.

(a) Fully probabilistic finite automaton $\mathcal{B}_{/\sigma_p}$ (b) Fully probabilistic finite automaton $\mathcal{B}_{/\sigma_m}$

Fig. 16. Scheduled automata.



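Probability $P(o_1)$ can be obtained by noticing that the execution has to stop after an odd number of letters from $\{a, b\}$ have been read. The probability of stopping after exactly $n$ letters from $a$ or $b$ is $\frac{1}{8} \cdot \left(\frac{7}{8}\right)^n$. Therefore

$$P(o_1) = \frac{1}{8} \cdot \sum_{i \ge 0} \left(\frac{7}{8}\right)^{2i+1} = \frac{1}{8} \cdot \frac{7}{8} \cdot \frac{1}{1 - \frac{49}{64}} = \frac{1}{8} \cdot \frac{7}{8} \cdot \frac{64}{15} = \frac{7}{15}.$$

Similar reasoning yields the other probabilities. The computation of RPO from these values (see Appendix C.3) gives $\mathsf{PO}_r^A(\mathcal{B}_{/\sigma_m}, \varphi, O) = \frac{88192}{146509} \approx 0.60$.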

Hence a lower security is achieved by a scheduler provided it has (a finite amount of) 
memory. 

Note that this example used RPO, but a similar argument could be adapted for the other 
measures. 
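
The gap between the two kinds of schedulers can be checked numerically. The sketch below evaluates the expression of the inverse RPO for the memoryless scheduler on a grid of values of p (a grid scan, not a proof of the 0.88 bound), and compares it with the value obtained for the scheduler with memory from the three terms of Appendix C.3. Function names and the grid resolution are choices made for this example.

```python
from fractions import Fraction as F


def f(p):
    return 25 * p * p - 25 * p + 58


def rpo_memoryless(p):
    """RPO of B scheduled by the memoryless scheduler with parameter p."""
    inverse = F(1, 8) + F(49, 8) * f(p) * (
        p / (7 * f(p) - 5 * p - 49) + (1 - p) / (7 * f(p) - 30 * p - 14)
    )
    return 1 / inverse


def rpo_with_memory():
    """RPO of B scheduled by the alternating scheduler with memory."""
    inverse = F(1, 8) + F(7, 15) * F(98, 53) + F(7, 8) * F(7, 15) * F(343, 208)
    return 1 / inverse


# Scan p over [0, 1] in steps of 1/1000 with exact rational arithmetic.
worst_memoryless = min(rpo_memoryless(F(k, 1000)) for k in range(1001))
```

On this grid the smallest memoryless value stays above 0.88, while the scheduler with memory reaches about 0.60.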



7.3 Restricted schedulers 

What made a scheduler with memory more powerful than the one without in the counterexample of Section 7.2 was the knowledge of the truth value of $\varphi$ and exactly what was observed. More precisely, if the predicate and the observables are regular languages represented by finite deterministic and complete automata (FDCA), schedulers can be restricted to choices according to the current state of these automata and the state of the system. We conjecture that this knowledge is sufficient for any scheduler to compromise security to the best of its capabilities.

Let $\varphi \subseteq \mathrm{CRun}(\mathcal{A})$ be a regular predicate represented by an FDCA $\mathcal{A}_\varphi$. Let $O : \mathrm{CRun}(\mathcal{A}) \to \{o_1, \dots, o_n\}$ be an observation function such that for $1 \le i \le n$, $O^{-1}(o_i)$ is a regular set represented by an FDCA $\mathcal{A}_{o_i}$. Consider the synchronized product $\mathcal{A}_{\varphi,O} = \mathcal{A}_\varphi \| \mathcal{A}_{o_1} \| \cdots \| \mathcal{A}_{o_n}$, which is also an FDCA, and denote by $Q_{\varphi,O}$ its set of states. Let $\mathcal{A}_{\varphi,O}(\rho)$ be the state of $\mathcal{A}_{\varphi,O}$ reached after reading $\rho$.

Definition 15 (Restricted $(\varphi, O)$-scheduler). A scheduler $\sigma$ for $\mathcal{A}$ is said to be $(\varphi, O)$-restricted if there exists a function $\sigma' : (Q_{\varphi,O} \times Q) \to \mathcal{D}(\mathcal{D}((\Sigma \times Q) \uplus \{\surd\}))$ such that for any run $\rho \in \mathrm{Run}(\mathcal{A})$, $\sigma(\rho) = \sigma'(\mathcal{A}_{\varphi,O}(\rho), \mathrm{lst}(\rho))$.

Remark that memoryless schedulers are always $(\varphi, O)$-restricted.

Proposition 4. If $\sigma$ is $(\varphi, O)$-restricted, then $\mathcal{A}_{/\sigma}$ is isomorphic to a finite FPFA.

These schedulers keep all information about the predicate and the observation. We conjecture that the relevant supremum is reached by a $(\varphi, O)$-restricted scheduler.

Proof (Sketch of proof of Proposition 4). It can be shown that if $\sigma$ is $(\varphi, O)$-restricted, then:

(1) $\sigma'$ is a memoryless scheduler for the product $\mathcal{A} \| \mathcal{A}_{\varphi,O}$

(2) and $(\mathcal{A} \| \mathcal{A}_{\varphi,O})_{/\sigma'} = \mathcal{A}_{/\sigma}$.
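
The idea of a restricted scheduler can be sketched in code: the scheduler on runs is obtained by replaying the run through a deterministic automaton and then consulting a function defined on finitely many pairs. The encoding below is an assumption for illustration only: a run is a pair (initial state, list of (action, state) steps), the automaton reads actions only, and the scheduler output is simplified to a dictionary of weights over distribution indices.

```python
def make_restricted_scheduler(sigma_prime, dfa_step, dfa_start):
    """Build sigma(run) = sigma_prime(automaton state after run, last state of run)."""
    def sigma(run):
        first_state, steps = run
        dfa_state, last = dfa_start, first_state
        for action, state in steps:
            dfa_state = dfa_step(dfa_state, action)
            last = state
        return sigma_prime(dfa_state, last)
    return sigma


# Toy instance (hypothetical): the automaton only tracks the parity of the run length.
def parity_step(state, action):
    return 1 - state


def toy_sigma_prime(dfa_state, system_state):
    # All the weight on distribution 0 at even positions, on distribution 1 at odd ones.
    return {dfa_state: 1}


toy_sigma = make_restricted_scheduler(toy_sigma_prime, parity_step, 0)
```

Although `toy_sigma` takes a whole run as input, its answer depends only on a pair drawn from a finite set, which is the property used in Proposition 4.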

8 Conclusion 

In this paper we introduced two dual notions of probabilistic opacity. The liberal one mea- 
sures the probability for an attacker observing a random execution of the system to be able 
to gain information he can be sure about. The restrictive one measures the level of certitude 
in the information acquired by an attacker observing the system. The extremal cases of both 
these notions coincide with the possibilistic notion of opacity, which evaluates the existence 
of a leak of sure information. These notions yield measures that generalize either the case 
of asymmetrical or symmetrical opacity, thus providing four measures. 

However, probabilistic opacity is not always easy to compute, especially if there are 
an infinite number of observables. Nevertheless, automatic computation is possible when 
dealing with regular predicates and finitely many regular observation classes. A prototype 
tool was implemented in Java, and can be used for numerical computation of opacity values. 

In future work we plan to explore more of the properties of probabilistic opacity and to instantiate it to known security measures (anonymity, non-interference, etc.). We also want to extend the study of the nondeterministic case by investigating the expressiveness of schedulers.

A Computation of RPO for the debit card system

We give here the details of the computation of RPO in the example of the debit cards system 
of Section 4.1. 



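$$
\begin{aligned}
P(O_{\mathrm{Call}} = \varepsilon) &= P(O_{\mathrm{Call}} = \varepsilon, x > 1000) + P(O_{\mathrm{Call}} = \varepsilon, 500 < x \le 1000) \\
&\quad + P(O_{\mathrm{Call}} = \varepsilon, 100 < x \le 500) + P(O_{\mathrm{Call}} = \varepsilon, x \le 100) \\
&= P(O_{\mathrm{Call}} = \varepsilon \mid x > 1000) \cdot P(x > 1000) \\
&\quad + P(O_{\mathrm{Call}} = \varepsilon \mid 500 < x \le 1000) \cdot P(500 < x \le 1000) \\
&\quad + P(O_{\mathrm{Call}} = \varepsilon \mid 100 < x \le 500) \cdot P(100 < x \le 500) \\
&\quad + P(O_{\mathrm{Call}} = \varepsilon \mid x \le 100) \cdot P(x \le 100) \\
&= 0.05 \cdot 0.05 + 0.25 \cdot 0.2 + 0.5 \cdot 0.45 + 0.8 \cdot 0.3 \\
&= 0.5175
\end{aligned}
$$

$$P(O_{\mathrm{Call}} = \mathrm{Call}) = 1 - P(O_{\mathrm{Call}} = \varepsilon) = 0.4825$$

$$
\begin{aligned}
P(\neg\varphi_{>500} \mid O_{\mathrm{Call}} = \varepsilon) &= \frac{P(\neg\varphi_{>500}, O_{\mathrm{Call}} = \varepsilon)}{P(O_{\mathrm{Call}} = \varepsilon)} \\
&= \frac{P(x \le 100, O_{\mathrm{Call}} = \varepsilon) + P(100 < x \le 500, O_{\mathrm{Call}} = \varepsilon)}{P(O_{\mathrm{Call}} = \varepsilon)} \\
&= \frac{P(O_{\mathrm{Call}} = \varepsilon \mid x \le 100) \cdot P(x \le 100) + P(O_{\mathrm{Call}} = \varepsilon \mid 100 < x \le 500) \cdot P(100 < x \le 500)}{P(O_{\mathrm{Call}} = \varepsilon)} \\
&= \frac{0.8 \cdot 0.3 + 0.5 \cdot 0.45}{0.5175} = \frac{0.465}{0.5175} \approx 0.899
\end{aligned}
$$

$$
\begin{aligned}
P(\neg\varphi_{>500} \mid O_{\mathrm{Call}} = \mathrm{Call}) &= \frac{P(\neg\varphi_{>500}, O_{\mathrm{Call}} = \mathrm{Call})}{P(O_{\mathrm{Call}} = \mathrm{Call})} \\
&= \frac{P(x \le 100, O_{\mathrm{Call}} = \mathrm{Call}) + P(100 < x \le 500, O_{\mathrm{Call}} = \mathrm{Call})}{P(O_{\mathrm{Call}} = \mathrm{Call})} \\
&= \frac{P(O_{\mathrm{Call}} = \mathrm{Call} \mid x \le 100) \cdot P(x \le 100) + P(O_{\mathrm{Call}} = \mathrm{Call} \mid 100 < x \le 500) \cdot P(100 < x \le 500)}{P(O_{\mathrm{Call}} = \mathrm{Call})} \\
&= \frac{0.2 \cdot 0.3 + 0.5 \cdot 0.45}{0.4825} = \frac{0.285}{0.4825} \approx 0.591
\end{aligned}
$$

$$
\begin{aligned}
\frac{1}{\mathsf{PO}_r^A(\mathcal{A}_{card}, \varphi_{>500}, O_{\mathrm{Call}})} &= 0.5175 \cdot \frac{0.5175}{0.465} + 0.4825 \cdot \frac{0.4825}{0.285} \\
&= \frac{39377}{28272} \approx 1.393
\end{aligned}
$$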



The last line was obtained by reducing the one above with the help of the formal computation 
tool WolframAlpha. 
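
The reduction to 39377/28272 can also be reproduced with exact rational arithmetic. The sketch below takes the four amount ranges with the probabilities used above; the table layout, the closed upper bounds of the ranges and the function name are assumptions made for the example.

```python
from fractions import Fraction as F

# Each entry: (P(x in range), P(O_Call = eps | range), range satisfies x > 500).
RANGES = [
    (F("0.05"), F("0.05"), True),   # x > 1000
    (F("0.2"), F("0.25"), True),    # 500 < x <= 1000
    (F("0.45"), F("0.5"), False),   # 100 < x <= 500
    (F("0.3"), F("0.8"), False),    # x <= 100
]


def rpo_debit_card(ranges=RANGES):
    """RPO as the inverse of sum over observations o of P(o) / P(not phi | o)."""
    inverse = 0
    for silent in (True, False):  # observation eps, then observation Call
        p_obs = 0
        p_not_phi_and_obs = 0
        for p_range, p_eps, satisfies_phi in ranges:
            p = p_range * (p_eps if silent else 1 - p_eps)
            p_obs += p
            if not satisfies_phi:
                p_not_phi_and_obs += p
        inverse += p_obs / (p_not_phi_and_obs / p_obs)
    return 1 / inverse
```

The inverse of the returned value is exactly 39377/28272, so the RPO itself is about 0.718.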



B Resolution of the linear system for Crowds protocol 

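It can be seen in the system of Table 6 that $L_1 = L_2 = \dots = L_{n-c-1} = q \cdot L_{1'}$ and $L_{n-c+1} = \dots = L_n = L_S = 1$. Therefore, it suffices to eliminate $L_{1'}$ and compute $L_0$, $L_1$ and $L_{n-c}$.

$$
\begin{cases}
L_0 = \frac{1}{q(n-c)} \cdot L_1 \\
L_1 = q \cdot \left( \frac{n-c-1}{n} \cdot L_1 + \frac{1}{n} \cdot L_{n-c} \right) \\
L_{n-c} = 1 - \frac{q(n-c)}{n} + L_1
\end{cases}
$$

The line for $L_{n-c}$ is obtained as follows:

$$
\begin{aligned}
L_{n-c} &= (1-q) \cdot L_S + \sum_{i=1}^{n} \frac{q}{n} \cdot L_i \\
&= (1-q) \cdot L_S + \sum_{i=1}^{n-c} \frac{q}{n} \cdot L_i + \sum_{i=n-c+1}^{n} \frac{q}{n} \cdot L_i \\
&= 1 - q + L_1 + \frac{q \cdot c}{n} \\
&= 1 - \frac{q(n-c)}{n} + L_1
\end{aligned}
$$

This yields, for $L_1$:

$$
\begin{aligned}
L_1 &= q \cdot \left( \frac{n-c-1}{n} \cdot L_1 + \frac{1}{n} \cdot \left( 1 - \frac{q(n-c)}{n} + L_1 \right) \right) \\
L_1 \cdot \left( 1 - \frac{q(n-c)}{n} \right) &= \frac{q}{n} \cdot \left( 1 - \frac{q(n-c)}{n} \right) \\
L_1 &= \frac{q}{n}
\end{aligned}
$$

The other values are easily deduced from $L_1$.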
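
The resolution can be replayed with exact arithmetic. The sketch below assumes that the reduced system reads L_0 = L_1 / (q(n-c)), L_1 = q((n-c-1)/n · L_1 + L_{n-c}/n) and L_{n-c} = 1 - q(n-c)/n + L_1, which is a reading of the garbled source that is consistent with the value P(1 ⇝ (n-c)) = 1/(n(n-c)) of Section 6; the function name is illustrative.

```python
from fractions import Fraction as F


def solve_crowds(n, c, q):
    """Solve the reduced system for (L_0, L_1, L_{n-c}) by substitution."""
    honest = n - c
    # Substitute L_{n-c} = 1 - q*honest/n + L_1 into the equation for L_1:
    #   L_1 * (1 - q*(honest-1)/n - q/n) = (q/n) * (1 - q*honest/n)
    coeff = 1 - q * F(honest - 1, n) - q * F(1, n)
    const = q * F(1, n) * (1 - q * F(honest, n))
    l_1 = const / coeff
    l_nc = 1 - q * F(honest, n) + l_1
    l_0 = l_1 / (q * honest)
    return l_0, l_1, l_nc
```

For example, `solve_crowds(10, 3, F(3, 4))` gives L_1 = 3/40 and L_0 = 1/70, and changing q leaves L_0 unchanged.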



C Calculations in the proof of Theorem 2
C.1 Probabilities in $\mathcal{B}_{/\sigma_p}$

We compute the probabilities of several events in the automaton $\mathcal{B}_{/\sigma_p}$, reproduced in Fig. 17(a). Recall that $\varphi = (ab)^+ + (ab)^*a$ and $O$ is the last letter read, so the set of observables is $Obs = \{\varepsilon, o_1, o_2\}$.

The computation of the probability $P(\varphi, o_1)$ in FPFA $\mathcal{B}_{/\sigma_p}$ goes as follows. We write $\bar{p} = 1 - p$ for brevity. First we build the synchronized product $\mathcal{B}_{/\sigma_p} \| \mathcal{A}_\varphi \| \mathcal{A}_{o_1}$, as depicted in Fig. 18. The linear system of Table 7 is built from this automaton. This system can be trimmed down in order to remove redundancy, since only the value of $x_{000} = P(\varphi, o_1)$ is of interest:



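$$
\begin{cases}
x_{000} = \frac{1}{8}\bar{p}\, x_{010} + \frac{3}{4} p\, x_{011} \\
x_{010} = \frac{1}{8} p\, x_{021} + \frac{3}{4}\bar{p}\, x_{020} \\
x_{011} = \frac{1}{8} + \frac{3}{4}\bar{p}\, x_{020} + \frac{1}{8} p\, x_{021} \\
x_{020} = \frac{1}{8}\bar{p}\, x_{010} + \frac{3}{4} p\, x_{011} \\
x_{021} = \frac{1}{8} + \frac{1}{8}\bar{p}\, x_{010} + \frac{3}{4} p\, x_{011}
\end{cases}
\iff
\begin{cases}
x_{000} = \frac{1}{8}\bar{p}\, x_{010} + \frac{3}{4} p\, x_{011} \\
x_{010} = \frac{1}{8} p\, x_{021} + \frac{3}{4}\bar{p}\, x_{020} \\
x_{011} = \frac{1}{8} + x_{010} \\
x_{020} = x_{000} \\
x_{021} = \frac{1}{8} + x_{000}
\end{cases}
$$

We therefore obtain:

$$
\begin{cases}
x_{000} = \frac{1}{8}\bar{p}\, x_{010} + \frac{3}{4} p \left( \frac{1}{8} + x_{010} \right) \\
x_{010} = \frac{1}{8} p \left( \frac{1}{8} + x_{000} \right) + \frac{3}{4}\bar{p}\, x_{000}
\end{cases}
\iff
\begin{cases}
x_{000} = \frac{1}{8}\bar{p}\, x_{010} + \frac{3}{4} p \left( \frac{1}{8} + x_{010} \right) \\
x_{010} = \frac{1}{64} p + \frac{1}{8} p\, x_{000} + \frac{3}{4}\bar{p}\, x_{000}
\end{cases}
$$

$$
\begin{cases}
x_{000} = \frac{1}{8}\bar{p}\, x_{210} + \frac{3}{4} p\, x_{110} \\
x_{210} = x_{010} \\
x_{110} = x_{011} \\
x_{010} = \frac{1}{8} p\, x_{120} + \frac{3}{4}\bar{p}\, x_{220} \\
x_{011} = \frac{1}{8} + \frac{3}{4}\bar{p}\, x_{221} + \frac{1}{8} p\, x_{121} \\
x_{120} = x_{021} \\
x_{220} = x_{020} \\
x_{221} = x_{020} \\
x_{121} = x_{021} \\
x_{020} = \frac{1}{8}\bar{p}\, x_{210} + \frac{3}{4} p\, x_{110} \\
x_{021} = \frac{1}{8} + \frac{1}{8}\bar{p}\, x_{211} + \frac{3}{4} p\, x_{111} \\
x_{211} = x_{010} \\
x_{111} = x_{011}
\end{cases}
$$

Table 7. Linear system associated to the SA $\mathcal{B}_{/\sigma_p} \| \mathcal{A}_\varphi \| \mathcal{A}_{o_1}$. The variable names indicate the corresponding state in the automaton; for example $x_{210}$ corresponds to state $(q_2, p_1, r_0)$.

Fig. 18. Substochastic Automaton $\mathcal{B}_{/\sigma_p} \| \mathcal{A}_\varphi \| \mathcal{A}_{o_1}$.

As a result

$$x_{000} = \frac{1}{8}\bar{p} \left( \frac{1}{64} p + \frac{1}{8} p\, x_{000} + \frac{3}{4}\bar{p}\, x_{000} \right) + \frac{3}{32} p + \frac{3}{4} p \left( \frac{1}{64} p + \frac{1}{8} p\, x_{000} + \frac{3}{4}\bar{p}\, x_{000} \right)$$

In the sequel, we replace $x_{000}$ with $x$ for readability's sake.

$$
\begin{aligned}
x &= \frac{1}{512} p\bar{p} + \frac{1}{64} p\bar{p}\, x + \frac{3}{32} \bar{p}^2 x + \frac{3}{32} p + \frac{3}{256} p^2 + \frac{3}{32} p^2 x + \frac{9}{16} p\bar{p}\, x \\
x \left( 1 - \frac{1}{64} p\bar{p} - \frac{3}{32} \bar{p}^2 - \frac{3}{32} p^2 - \frac{9}{16} p\bar{p} \right) &= \frac{1}{512} p\bar{p} + \frac{3}{32} p + \frac{3}{256} p^2 \\
x &= \frac{\frac{1}{8} p\bar{p} + 6p + \frac{3}{4} p^2}{64 - p\bar{p} - 6\bar{p}^2 - 6p^2 - 36 p\bar{p}} \\
&= \frac{\frac{1}{8} p(1-p) + 6p + \frac{3}{4} p^2}{64 - 37p(1-p) - 6(p^2 - 2p + 1) - 6p^2} \\
&= \frac{\frac{1}{8} p - \frac{1}{8} p^2 + 6p + \frac{3}{4} p^2}{64 - 37p + 37p^2 - 6p^2 + 12p - 6 - 6p^2} \\
&= \frac{1}{8} \cdot p \cdot \frac{5p + 49}{25p^2 - 25p + 58}
\end{aligned}
$$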



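The same technique can be applied to the computation of $P(\varphi, o_2)$ in $\mathcal{B}_{/\sigma_p}$. The product is depicted in Fig. 19, and the linear system obtained boils down to

$$
\begin{cases}
x_{000} = \frac{1}{8}\bar{p} \left( \frac{1}{8} + x_{010} \right) + \frac{3}{4} p\, x_{010} \\
x_{010} = \frac{1}{8} p\, x_{000} + \frac{3}{4}\bar{p} \left( \frac{1}{8} + x_{000} \right)
\end{cases}
$$

As before, $x_{000}$ is replaced by $x$ for readability; we solve:

$$
\begin{aligned}
x &= \frac{1}{8}\bar{p} \left( \frac{1}{8} + \frac{1}{8} p\, x + \frac{3}{4}\bar{p} \left( \frac{1}{8} + x \right) \right) + \frac{3}{4} p \left( \frac{1}{8} p\, x + \frac{3}{4}\bar{p} \left( \frac{1}{8} + x \right) \right) \\
&= \frac{1}{64}\bar{p} + \frac{1}{64} p\bar{p}\, x + \frac{3}{256}\bar{p}^2 + \frac{3}{32}\bar{p}^2 x + \frac{3}{32} p^2 x + \frac{9}{128} p\bar{p} + \frac{9}{16} p\bar{p}\, x \\
x &= \frac{\frac{1}{64}\bar{p} + \frac{3}{256}\bar{p}^2 + \frac{9}{128} p\bar{p}}{1 - \frac{1}{64} p\bar{p} - \frac{3}{32}\bar{p}^2 - \frac{3}{32} p^2 - \frac{9}{16} p\bar{p}} \\
&= \frac{4\bar{p} + 3\bar{p}^2 + 18 p\bar{p}}{256 - 24\bar{p}^2 - 24p^2 - 148 p\bar{p}} \\
&= \frac{(1-p)(4 + 3 - 3p + 18p)}{256 - 24p^2 + 48p - 24 - 24p^2 - 148p + 148p^2} \\
&= \frac{(1-p)(15p + 7)}{232 - 100p + 100p^2} \\
&= \frac{(1-p)(15p + 7)}{4(25p^2 - 25p + 58)}
\end{aligned}
$$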
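
Both closed forms can be cross-checked by solving the two-variable systems directly. The sketch below assumes that the coefficients of the reduced systems are 1/8 and 3/4 as read from this appendix; `a_coeff` and `b_coeff` are names introduced here for the coefficients multiplying $x_{010}$ and $x_{000}$ respectively.

```python
from fractions import Fraction as F


def f(p):
    return 25 * p * p - 25 * p + 58


def joint_probs(p):
    """Solve the reduced systems for P(phi, o1) and P(phi, o2)."""
    pb = 1 - p
    a_coeff = F(1, 8) * pb + F(3, 4) * p   # x000 = constant + a_coeff * x010
    b_coeff = F(1, 8) * p + F(3, 4) * pb   # x010 = constant + b_coeff * x000
    # o1 system: x000 = (3/32) p + a_coeff * x010, x010 = (1/64) p + b_coeff * x000
    x_o1 = (F(3, 32) * p + a_coeff * F(1, 64) * p) / (1 - a_coeff * b_coeff)
    # o2 system: x000 = (1/64) pb + a_coeff * x010, x010 = (3/32) pb + b_coeff * x000
    x_o2 = (F(1, 64) * pb + a_coeff * F(3, 32) * pb) / (1 - a_coeff * b_coeff)
    return x_o1, x_o2


def closed_o1(p):
    return p * (5 * p + 49) / (8 * f(p))


def closed_o2(p):
    return (1 - p) * (15 * p + 7) / (4 * f(p))
```

For any rational `p` in [0, 1], `joint_probs(p)` should coincide with `(closed_o1(p), closed_o2(p))`.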



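Fig. 19. Substochastic Automaton $\mathcal{B}_{/\sigma_p} \| \mathcal{A}_\varphi \| \mathcal{A}_{o_2}$.

C.2 Computation of RPO for $\mathcal{B}_{/\sigma_p}$

In the sequel, we write $f(p) = 25p^2 - 25p + 58$. We have $P(\neg\varphi \mid \varepsilon) = 1$ and

$$
\begin{aligned}
P(\neg\varphi \mid o_1) &= 1 - \frac{P(\varphi, o_1)}{P(o_1)} \\
&= 1 - \frac{1}{8} \cdot p \cdot \frac{5p + 49}{25p^2 - 25p + 58} \cdot \frac{8}{7p} \\
&= \frac{7f(p) - 5p - 49}{7f(p)}
\end{aligned}
$$

$$
\begin{aligned}
P(\neg\varphi \mid o_2) &= 1 - \frac{P(\varphi, o_2)}{P(o_2)} \\
&= 1 - \frac{(1-p)(15p + 7)}{4(25p^2 - 25p + 58)} \cdot \frac{8}{7(1-p)} \\
&= \frac{7f(p) - 30p - 14}{7f(p)}
\end{aligned}
$$

$$
\begin{aligned}
\frac{1}{\mathsf{PO}_r^A(\mathcal{B}_{/\sigma_p}, \varphi, O)} &= P(\varepsilon) \cdot \frac{1}{P(\neg\varphi \mid \varepsilon)} + P(o_1) \cdot \frac{1}{P(\neg\varphi \mid o_1)} + P(o_2) \cdot \frac{1}{P(\neg\varphi \mid o_2)} \\
&= \frac{1}{8} + \frac{7}{8} \cdot p \cdot \frac{7f(p)}{7f(p) - 5p - 49} + \frac{7}{8} \cdot (1-p) \cdot \frac{7f(p)}{7f(p) - 30p - 14} \\
&= \frac{1}{8} + \frac{49}{8} \cdot f(p) \cdot \left( \frac{p}{7f(p) - 5p - 49} + \frac{1-p}{7f(p) - 30p - 14} \right)
\end{aligned}
$$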



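C.3 Computation of RPO for $\mathcal{B}_{/\sigma_m}$

We have:

$$P(\neg\varphi \mid \varepsilon) = 1 \qquad P(\neg\varphi \mid o_1) = 1 - \frac{3}{14} \cdot \frac{15}{7} = \frac{53}{98} \qquad P(\neg\varphi \mid o_2) = 1 - \frac{3}{4} \cdot \frac{3}{14} \cdot \frac{8}{7} \cdot \frac{15}{7} = \frac{208}{343}$$

Therefore:

$$
\begin{aligned}
\frac{1}{\mathsf{PO}_r^A(\mathcal{B}_{/\sigma_m}, \varphi, O)} &= \frac{1}{8} + \frac{7}{15} \cdot \frac{98}{53} + \frac{7}{8} \cdot \frac{7}{15} \cdot \frac{343}{208} \\
&= \frac{146509}{88192}
\end{aligned}
$$

$$\mathsf{PO}_r^A(\mathcal{B}_{/\sigma_m}, \varphi, O) \approx 0.60$$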



